Introduction¶
The United States stock market is one of the oldest, largest, and most lucrative markets in the world currently. The overall size of the US market is $25.564 trillion, hosting over 2800 publicly traded companies (which are companies that have already gone through an IPO or Initial Public Offering event). More about IPO’s and how they work with the market can be found here. Because of the volume which the US market has in trades per day, we use multiple index tractors to increase the expected return of a portfolio while minimizing the overall risk of trading in the market. The US market poses multiple index tractors, however, the main 3 are Dow Jones, NASDAQ, and S&P 500. These tickers are given more numerical information (data) by these companies which provide a more comprehensive market view which is weighted by market capitalization. Additionally, we have the NYSE ticker which shows the overall process, growth, and whether the New York Stock Exchange is up or down. We focus on the New York Stock Exchange since the entire US market is based there on Wall Street.
Our analysis focuses on the Standard and Poor's 500 (S&P 500), which is a stock market index, tracking the stock performance of 500 of the largest companies listed on stock exchanges in the United States. These stocks usually consist of blue chip stocks which will almost always have upside potential for investors (examples of companies under the S&P 500: Apple–AAPL, Amazon–AMZN, Chipotle–CMG, etc…). We chose the S&P 500 because of the different types of stocks listed within the index. These stocks operate in multiple different sectors, and a majority have been around for a while providing us with ample data to work with. Additionally, these are some of the most looked at stocks by the general public, large investors (such as Warren Buffet), and larger banks and hedge funds (e.g. JPMC). An article talking about the S&P 500 can be found here.
With modern technology, we as consumers are able to look at data on our own and see which categories of the market would go up. There are plenty of factors which can affect a market's outcome such as economic factors, sudden events such as a pandemic, or even just shifts from word of mouth creating large trade volumes. However, some of these have only short term effects on the market growth which is why looking at economic indicators is one of the ways we can track previous growth of the market (or different sectors in the market) and use it as a way of predicting future outcomes. Major economic indicators which trigger long term shifts in the market include GDP, Unemployment Rates, Inflation, Interest Rates, and plenty more. The market can be thought of as something which is affected by multiple factors, however some affect it more than others, and sometimes data for these indicators isn’t as easily available. This is why we focused on the GDP and Unemployment Rates. Using these indicators for predicting future economic and market conditions is important because it allows the general public to buy and sell, but also gives us a relative idea of if we need to make a shift in our economy early. More about economic indicators can be found here.
Of course, predictions can be affected by the inaccuracy of a model, or old data, or sudden large scale events which trigger long term market shifts, however, we can still use these as a baseline to plan for our futures as normal people investing and as an indicator for overall economic condition.
That is why we decided to focus on this topic. Our background from an early age with investing in the stock market, along with our curiosity for looking deeper as seeing how shifts occur from a few economic indicators gave us motivation to pursue this topic.
import warnings
warnings.filterwarnings('ignore')
#Imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import requests
from bs4 import BeautifulSoup as bs
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix, classification_report
import matplotlib.ticker as ticker
import seaborn as sns
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
Data Collection and Data Processing¶
Wikipedia¶
In order to have all sector data (company general industry), clean tickers, and securities(company names) in a final data set, we scraped from a wikipedia table found here. We scraped this wikipedia page to obtain those 3 main points mentioned above since our dataset which included all the stock data was in text form without those key ways to identify how we would use the data later in the project.
#wiki data of stocks
req = requests.get('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
root = bs(req.content, "html")
wiki_df = pd.read_html(root.find('table').prettify())[0]
stocks_df = wiki_df[["Symbol", "Security", "GICS Sector"]] #cleaning the original data pulled for only relevant columns
stocks_df.head(20)
| Symbol | Security | GICS Sector | |
|---|---|---|---|
| 0 | MMM | 3M | Industrials |
| 1 | AOS | A. O. Smith | Industrials |
| 2 | ABT | Abbott | Health Care |
| 3 | ABBV | AbbVie | Health Care |
| 4 | ACN | Accenture | Information Technology |
| 5 | ADBE | Adobe Inc. | Information Technology |
| 6 | AMD | Advanced Micro Devices | Information Technology |
| 7 | AES | AES Corporation | Utilities |
| 8 | AFL | Aflac | Financials |
| 9 | A | Agilent Technologies | Health Care |
| 10 | APD | Air Products and Chemicals | Materials |
| 11 | ABNB | Airbnb | Consumer Discretionary |
| 12 | AKAM | Akamai | Information Technology |
| 13 | ALB | Albemarle Corporation | Materials |
| 14 | ARE | Alexandria Real Estate Equities | Real Estate |
| 15 | ALGN | Align Technology | Health Care |
| 16 | ALLE | Allegion | Industrials |
| 17 | LNT | Alliant Energy | Utilities |
| 18 | ALL | Allstate | Financials |
| 19 | GOOGL | Alphabet Inc. (Class A) | Communication Services |
Kaggle¶
Next is the most computationally expensive part of this tutorial. Each of the 7000+ stocks found in this data set store their data in a unique text file (all found at https://www.kaggle.com/datasets/borismarjanovic/price-volume-data-for-all-us-stocks-etfs) , which will make collecting this data very CPU heavy. Typically, we would not be able to efficiently create and test a tutorial with an initial data set this large and would be forced to truncate the data we work with. However, we had access to a more powerful machine than a basic PC and therefore could keep our scope on the entire S&P 500 rather than a shortened version. Our data included all the data for every day of the stock's life from 1952 to 2017. This meant that within a calendar year we were looking at well over 1000 values per year since each day recorded had 7 different entries which were date, daily open value, daily high value, daily low value, daily close value, daily volume traded, and openInt. This collection process is performed by a loop that runs through every ticker extracted from the wikipedia table and formats it to the expected text file name format in the data set (i.e ticker.us.txt). For our purposes we needed the dates, high, low, and volume traded per day. These initial raw data points let us clean our data to align it with our economic indicators and eventually merge the dataset with the wikipedia web scraping step. As a bit of preprocessing, after every available stock’s data is collected, the unnecessary columns are dropped and the highs and lows are averaged by fiscal quarter and placed into the data frame. To do so we looked at the yearly quarter changes and then took all the values in there as our quarter amounts per year. We did this for a few reasons.
1. One was the large amount of data would have been too much to continually process and would have made our finding skewed or inaccurate.
2. Our economic indicators we chose (talked more about below) are tracked in quarterly periods so we changed the stock data to quarterly to limit the skew of daily data being compared to quarterly data.
#kaggle dataset collecting and processing
import csv
import os
stock_files = os.listdir("Stocks")
quarters = ["01-01", "04-01", "07-01", "10-01", "12-31"]
years = list(range(2013, 2018))
sp500_tickers = list(wiki_df["Symbol"])
kaggle_df = pd.DataFrame()
for ticker in sp500_tickers: #going through all the sp500 tickers
filename = f"{ticker.lower()}.us.txt"
if filename not in stock_files:
continue
#print(f"Processing {filename}")
with open("Stocks/" + filename, "r") as daily_data: #opening the file for each tickers data
data_reader = csv.reader(daily_data)
i = 0
cols = next(data_reader)
df = pd.DataFrame(columns=cols)
target_year = 2013
line = next(data_reader)
year = int(line[0][0:4])
if year > target_year:
continue
while year != target_year:
line = next(data_reader)
year = int(line[0][0:4])
df.loc[-1] = line
df.index += 1
for line in data_reader:
df.loc[-1] = line
df.index += 1
for c in cols[1:-1]:
df[c] = df[c].astype(float)
df.reset_index(inplace=True)
df.drop(["OpenInt", "Open", "Close", "index"], axis=1, inplace=True)
row = {"Symbol":ticker}
# Averaging quarterly data
for year in years:
for i in range(len(quarters) - 1):
begin = f"{year}-{quarters[i]}"
end = f"{year}-{quarters[i + 1]}"
quarter = f"{year} Q{i + 1}"
quarterly_df = df[(df["Date"] >= begin) & (df["Date"] < end)]
quarterly_df = quarterly_df[["High","Low","Volume"]].mean()
quarterly_df.rename({"High": f"{quarter} Mean High", "Low": f"{quarter} Mean Low", "Volume": f"{quarter} Mean Volume"}, inplace=True)
for k, v in dict(quarterly_df).items():
row[k] = v
row = pd.DataFrame(pd.Series(row)).T
kaggle_df = pd.concat([kaggle_df, row])
kaggle_df.head(30)
| Symbol | 2013 Q1 Mean High | 2013 Q1 Mean Low | 2013 Q1 Mean Volume | 2013 Q2 Mean High | 2013 Q2 Mean Low | 2013 Q2 Mean Volume | 2013 Q3 Mean High | 2013 Q3 Mean Low | 2013 Q3 Mean Volume | 2013 Q4 Mean High | 2013 Q4 Mean Low | 2013 Q4 Mean Volume | 2014 Q1 Mean High | 2014 Q1 Mean Low | 2014 Q1 Mean Volume | 2014 Q2 Mean High | 2014 Q2 Mean Low | 2014 Q2 Mean Volume | 2014 Q3 Mean High | 2014 Q3 Mean Low | 2014 Q3 Mean Volume | 2014 Q4 Mean High | 2014 Q4 Mean Low | 2014 Q4 Mean Volume | 2015 Q1 Mean High | 2015 Q1 Mean Low | 2015 Q1 Mean Volume | 2015 Q2 Mean High | 2015 Q2 Mean Low | 2015 Q2 Mean Volume | 2015 Q3 Mean High | 2015 Q3 Mean Low | 2015 Q3 Mean Volume | 2015 Q4 Mean High | 2015 Q4 Mean Low | 2015 Q4 Mean Volume | 2016 Q1 Mean High | 2016 Q1 Mean Low | 2016 Q1 Mean Volume | 2016 Q2 Mean High | 2016 Q2 Mean Low | 2016 Q2 Mean Volume | 2016 Q3 Mean High | 2016 Q3 Mean Low | 2016 Q3 Mean Volume | 2016 Q4 Mean High | 2016 Q4 Mean Low | 2016 Q4 Mean Volume | 2017 Q1 Mean High | 2017 Q1 Mean Low | 2017 Q1 Mean Volume | 2017 Q2 Mean High | 2017 Q2 Mean Low | 2017 Q2 Mean Volume | 2017 Q3 Mean High | 2017 Q3 Mean Low | 2017 Q3 Mean Volume | 2017 Q4 Mean High | 2017 Q4 Mean Low | 2017 Q4 Mean Volume | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | MMM | 91.312167 | 90.39165 | 3000908.633333 | 98.020156 | 96.832453 | 3184629.796875 | 105.455313 | 104.375 | 2593571.890625 | 115.978413 | 114.62381 | 3003412.984127 | 122.193607 | 120.529344 | 3286431.163934 | 129.457937 | 128.161111 | 2362018.47619 | 133.417656 | 132.191094 | 2302554.203125 | 143.35873 | 141.33619 | 3041470.253968 | 155.016885 | 152.888197 | 2458525.803279 | 151.794762 | 150.024921 | 2440807.730159 | 140.275156 | 138.233281 | 2956910.703125 | 147.416667 | 145.417778 | 2596462.174603 | 148.371311 | 146.186393 | 2821154.639344 | 163.714844 | 161.959375 | 1975933.640625 | 174.425313 | 172.915781 | 1730216.75 | 169.834127 | 167.954603 | 2021322.095238 | 181.455806 | 179.752903 | 1989174.451613 | 199.043492 | 197.367778 | 1849468.174603 | 207.525206 | 205.418159 | 1684750.68254 | 225.0516 | 222.543667 | 1925851.566667 |
| 0 | AOS | 16.693183 | 16.394633 | 1580067.933333 | 17.946859 | 17.548625 | 1883052.6875 | 20.077312 | 19.714828 | 1230033.8125 | 24.618 | 24.195079 | 1645559.492063 | 23.568213 | 23.127361 | 1678852.622951 | 23.258302 | 22.849127 | 1349398.031746 | 23.570563 | 23.206109 | 1015275.21875 | 25.370365 | 24.94981 | 999006.349206 | 29.686967 | 29.156984 | 1118223.540984 | 33.707444 | 33.210063 | 1020983.460317 | 33.785922 | 32.879422 | 1674719.625 | 37.247635 | 36.402302 | 1528875.650794 | 34.733787 | 33.713066 | 1840718.47541 | 40.014906 | 39.273094 | 1472523.25 | 46.386844 | 45.709766 | 1067305.671875 | 48.150698 | 47.271032 | 1280703.47619 | 49.546 | 48.752032 | 919397.596774 | 54.049048 | 53.272238 | 869093.190476 | 56.258095 | 55.529968 | 1342885.873016 | 60.550513 | 59.777517 | 862284.733333 |
| 0 | ABT | 30.724817 | 30.309883 | 12451680.116667 | 33.482594 | 32.897703 | 7951001.546875 | 32.113391 | 31.673766 | 8005789.8125 | 33.834984 | 33.331286 | 7459989.047619 | 35.685213 | 35.102541 | 9174882.016393 | 36.46273 | 35.985317 | 6591339.587302 | 39.491984 | 39.013391 | 5035284.0 | 40.963206 | 40.289476 | 5394024.698413 | 43.385066 | 42.705344 | 5984359.327869 | 45.528857 | 44.942651 | 4981663.079365 | 45.127313 | 44.260953 | 7043142.28125 | 42.363111 | 41.620952 | 6665399.507937 | 38.360328 | 37.510508 | 8101788.295082 | 38.6665 | 38.062969 | 10643153.84375 | 41.879109 | 41.256594 | 9773374.203125 | 39.127889 | 38.410063 | 8882569.047619 | 42.683161 | 42.123548 | 8615529.145161 | 45.023984 | 44.466397 | 6880527.587302 | 50.187952 | 49.569937 | 5433541.793651 | 55.202797 | 54.498413 | 5491270.8 |
| 0 | ABBV | 31.711379 | 30.96831 | 10897951.293103 | 37.874344 | 36.874609 | 7384147.078125 | 38.554094 | 37.840172 | 4690873.125 | 43.141349 | 42.268556 | 6319228.111111 | 44.859967 | 43.921311 | 7609473.032787 | 46.878524 | 45.901365 | 6642542.222222 | 49.877344 | 48.994797 | 13670690.390625 | 57.269476 | 55.990286 | 13088959.507937 | 55.40041 | 54.023197 | 13034415.47541 | 60.105444 | 59.026444 | 12052673.666667 | 60.615578 | 59.162406 | 10942893.65625 | 54.219746 | 52.613365 | 9892407.507937 | 53.032361 | 51.245393 | 9461727.672131 | 58.174437 | 56.981422 | 9176243.015625 | 62.272141 | 61.371672 | 7220892.84375 | 59.454603 | 58.236333 | 7871293.349206 | 61.587597 | 60.726435 | 7046768.645161 | 66.49673 | 65.606238 | 5782878.920635 | 76.087587 | 74.74454 | 6246636.222222 | 93.05493 | 90.88035 | 6171383.966667 |
| 0 | ACN | 65.413133 | 64.46965 | 2958136.65 | 71.669828 | 70.472016 | 3545633.890625 | 66.856953 | 65.881375 | 3753502.28125 | 68.918381 | 67.920349 | 3565505.31746 | 75.105098 | 73.927754 | 3385196.557377 | 74.092238 | 73.077921 | 3025692.761905 | 74.035734 | 73.190891 | 2751022.140625 | 77.558571 | 76.471746 | 3074781.619048 | 83.717639 | 82.393557 | 2900945.442623 | 90.404254 | 89.257111 | 2328701.730159 | 94.127406 | 92.481 | 2749655.25 | 100.857413 | 99.433635 | 2597977.174603 | 98.467754 | 96.567508 | 3563175.213115 | 111.829062 | 110.387188 | 2252809.296875 | 110.0425 | 108.5325 | 2355634.875 | 116.823968 | 115.13 | 2508397.920635 | 117.692258 | 116.198548 | 2577157.33871 | 121.100476 | 119.760794 | 2231790.698413 | 129.643016 | 128.244762 | 1905023.0 | 140.0467 | 138.514 | 1750601.6 |
| 0 | ADBE | 39.7829 | 39.11015 | 3569916.466667 | 44.708719 | 43.920484 | 3709883.078125 | 48.072375 | 47.381406 | 3170725.46875 | 55.64381 | 54.627508 | 3985544.47619 | 64.792295 | 63.390164 | 3924305.557377 | 65.134952 | 63.753222 | 3614820.587302 | 71.433422 | 70.293672 | 2636903.1875 | 70.770476 | 69.400349 | 3332754.111111 | 75.196279 | 73.791721 | 2366642.131148 | 78.606508 | 77.328 | 2462360.793651 | 81.957266 | 80.139375 | 3419304.015625 | 90.60054 | 88.735 | 2952103.492063 | 87.827552 | 85.323174 | 3681589.95082 | 96.7314 | 95.154775 | 2111794.109375 | 100.877748 | 99.587905 | 1807114.6875 | 106.782841 | 105.022873 | 2293033.888889 | 117.874855 | 116.368194 | 2306142.419355 | 137.783524 | 135.854397 | 2570180.857143 | 150.173397 | 148.012063 | 1972602.984127 | 166.874633 | 163.9689 | 3072533.633333 |
| 0 | AMD | 2.672167 | 2.564667 | 18729917.45 | 3.521875 | 3.371563 | 28015006.546875 | 3.874219 | 3.744844 | 29267395.953125 | 3.633492 | 3.531429 | 24142526.84127 | 3.882131 | 3.758033 | 25269923.836066 | 4.116508 | 3.99381 | 21170687.333333 | 4.112656 | 3.992812 | 29876772.28125 | 2.809206 | 2.703492 | 18739877.619048 | 2.868279 | 2.74918 | 17290156.918033 | 2.463016 | 2.385556 | 14835027.238095 | 1.936641 | 1.833281 | 13248920.109375 | 2.268175 | 2.167619 | 9464912.730159 | 2.35041 | 2.198033 | 14099698.442623 | 3.985508 | 3.764922 | 22684549.421875 | 6.523773 | 6.237422 | 35252396.84375 | 8.582698 | 8.168543 | 46450313.365079 | 12.959565 | 12.396373 | 64543556.741935 | 12.506833 | 11.939441 | 76875665.444444 | 13.323238 | 12.81497 | 64954818.825397 | 12.990133 | 12.568997 | 61987566.333333 |
| 0 | AES | 10.10842 | 9.914168 | 7043960.75 | 11.204391 | 10.953047 | 6695538.890625 | 11.321672 | 11.107187 | 4528306.390625 | 12.618524 | 12.345254 | 6451758.952381 | 12.553852 | 12.295115 | 6094453.245902 | 12.804984 | 12.569683 | 6143209.666667 | 13.290375 | 13.076266 | 4156436.75 | 12.298159 | 11.993286 | 6739359.666667 | 11.363377 | 11.076623 | 7149936.786885 | 12.214016 | 11.966571 | 6735586.222222 | 11.136202 | 10.871042 | 6675684.125 | 9.324605 | 9.074429 | 6452733.015873 | 9.368844 | 9.050907 | 6921824.786885 | 10.665641 | 10.432906 | 5180647.8125 | 11.885609 | 11.656453 | 4655622.859375 | 11.352571 | 11.085476 | 5835446.761905 | 11.157871 | 10.906742 | 5824308.983871 | 11.298889 | 11.09727 | 5616984.539683 | 11.091317 | 10.909143 | 4938647.380952 | 10.9345 | 10.755767 | 5099342.833333 |
| 0 | AFL | 46.2141 | 45.433267 | 4104190.6 | 49.176641 | 48.289062 | 3435438.578125 | 54.800359 | 54.012328 | 2189281.828125 | 59.970825 | 59.241762 | 2141384.761905 | 58.619787 | 57.822098 | 2175752.688525 | 57.771492 | 57.017746 | 1926278.460317 | 56.681 | 56.033297 | 2120122.046875 | 55.331175 | 54.628984 | 2444321.063492 | 57.613 | 56.80082 | 2244606.655738 | 60.131063 | 59.417365 | 2073244.571429 | 57.887156 | 56.950672 | 2387365.484375 | 59.969111 | 59.055286 | 2358668.0 | 57.645623 | 56.610869 | 2563169.836066 | 66.453531 | 65.644219 | 2310928.90625 | 71.264313 | 70.539875 | 1739223.609375 | 69.491175 | 68.60873 | 2020415.47619 | 70.262032 | 69.446823 | 2214819.548387 | 74.90873 | 74.134111 | 1846555.492063 | 80.695032 | 79.83694 | 1541004.079365 | 84.0718 | 83.197033 | 1440761.533333 |
| 0 | A | 29.667933 | 29.165533 | 5100460.5 | 30.035625 | 29.418578 | 5595136.578125 | 32.569641 | 32.047453 | 3832068.15625 | 36.757286 | 36.169667 | 3353795.761905 | 40.050869 | 39.277967 | 4170020.704918 | 39.052714 | 38.360952 | 2721890.15873 | 39.777484 | 39.240562 | 2585435.890625 | 39.484873 | 38.748032 | 3063606.825397 | 39.669721 | 38.975672 | 2532460.409836 | 40.752206 | 40.151476 | 2448139.761905 | 37.349875 | 36.627813 | 2521089.796875 | 38.163619 | 37.420571 | 3004962.285714 | 37.845246 | 36.91459 | 2357458.967213 | 42.781375 | 42.130453 | 2219362.46875 | 46.398047 | 45.748719 | 1877364.125 | 45.384619 | 44.604825 | 1990839.174603 | 50.474952 | 49.724629 | 2029887.822581 | 57.100619 | 56.334349 | 2009349.047619 | 62.56181 | 61.779984 | 1643051.47619 | 67.413667 | 66.701367 | 1397107.4 |
| 0 | APD | 77.760883 | 76.798833 | 1093138.95 | 81.289891 | 80.249953 | 1552260.046875 | 92.838203 | 91.196234 | 2046325.296875 | 98.905492 | 97.454413 | 1143625.253968 | 104.973115 | 103.321607 | 1455185.934426 | 110.916032 | 109.299365 | 1199132.857143 | 122.2025 | 120.556875 | 1628685.890625 | 127.274921 | 125.009206 | 1710348.936508 | 139.970164 | 137.682459 | 1108369.032787 | 138.894444 | 136.88254 | 1078376.428571 | 130.466094 | 127.980312 | 1500850.15625 | 129.770635 | 127.481429 | 1433344.746032 | 125.588852 | 122.712131 | 1905698.672131 | 138.716563 | 136.732812 | 1064761.28125 | 145.268594 | 143.489688 | 1379288.796875 | 137.335556 | 135.394921 | 1452917.984127 | 139.034194 | 137.274355 | 1545373.048387 | 141.140476 | 139.533016 | 1303113.492063 | 146.273016 | 144.856825 | 1410870.507937 | 156.523133 | 155.0267 | 926571.433333 |
| 0 | AKAM | 38.345833 | 37.470117 | 3942939.716667 | 42.344391 | 41.340719 | 2741468.78125 | 47.475531 | 46.609094 | 1724353.609375 | 47.71781 | 46.69873 | 2251961.84127 | 55.955328 | 54.503295 | 2719331.836066 | 56.288587 | 54.859762 | 2191404.68254 | 60.705016 | 59.619875 | 1624116.5 | 60.925302 | 59.595159 | 1622168.206349 | 66.694918 | 65.303443 | 1631151.639344 | 74.717937 | 73.624524 | 1419948.698413 | 72.898438 | 71.173375 | 1800375.140625 | 62.297746 | 60.930667 | 2270051.634921 | 51.614267 | 49.998952 | 2834324.114754 | 53.284647 | 51.860842 | 1531503.703125 | 54.033789 | 53.00923 | 1895174.203125 | 64.308589 | 62.977471 | 2097015.238095 | 66.29605 | 65.184232 | 1916476.16129 | 53.296676 | 52.291944 | 2627907.0 | 48.271071 | 47.413467 | 1919155.206349 | 52.275247 | 51.428977 | 2122564.7 |
| 0 | ALB | 59.422 | 58.35975 | 738925.166667 | 58.865891 | 57.647281 | 798707.59375 | 59.128953 | 58.218312 | 804109.703125 | 62.520635 | 61.596651 | 662333.936508 | 61.266967 | 60.274475 | 631124.557377 | 65.074079 | 64.164619 | 533168.222222 | 62.160953 | 60.983437 | 1437183.359375 | 56.594016 | 55.267 | 1462997.079365 | 52.807508 | 51.484361 | 2007614.065574 | 57.841937 | 56.750413 | 1095955.634921 | 47.790344 | 46.420312 | 1471876.21875 | 50.553 | 49.121714 | 1480702.650794 | 54.459115 | 52.543164 | 1433544.245902 | 72.837594 | 71.230234 | 1899285.328125 | 81.901984 | 80.286344 | 1283941.09375 | 84.576016 | 82.875873 | 888396.555556 | 97.426516 | 95.477742 | 1036087.532258 | 109.16 | 107.409206 | 1023728.301587 | 119.617524 | 117.308175 | 1430350.84127 | 140.3418 | 137.821167 | 1259730.2 |
| 0 | ARE | 61.36855 | 60.743033 | 510877.366667 | 61.040516 | 59.928797 | 812680.375 | 57.557531 | 56.515922 | 497802.3125 | 56.650143 | 55.566413 | 487156.761905 | 62.462639 | 61.485738 | 523381.442623 | 67.283714 | 66.416952 | 380025.349206 | 70.467438 | 69.669313 | 459933.390625 | 76.38346 | 75.279413 | 490275.349206 | 89.050443 | 87.467197 | 471547.639344 | 86.529873 | 85.126968 | 399307.920635 | 84.633359 | 83.013094 | 364752.734375 | 85.659873 | 84.143825 | 468813.809524 | 77.99 | 76.298508 | 515845.04918 | 91.893234 | 90.337047 | 623097.203125 | 105.790109 | 103.950516 | 587047.640625 | 105.292381 | 103.161841 | 609395.222222 | 111.310645 | 109.527903 | 1028053.66129 | 115.439048 | 113.911905 | 727220.412698 | 120.099048 | 118.52127 | 589167.52381 | 123.519667 | 122.114167 | 492825.366667 |
| 0 | ALGN | 31.159017 | 30.377733 | 965606.483333 | 34.975797 | 33.909 | 932521.21875 | 44.089359 | 43.058906 | 623338.703125 | 54.480365 | 53.042889 | 1102201.47619 | 56.988443 | 55.11541 | 1269293.868852 | 52.782825 | 51.130048 | 1145288.68254 | 55.01075 | 53.947656 | 664095.84375 | 53.482111 | 52.269952 | 715800.730159 | 57.678639 | 56.164098 | 932696.57377 | 59.708095 | 58.407381 | 876989.904762 | 60.373438 | 58.926797 | 730518.59375 | 64.794143 | 63.197921 | 649318.666667 | 66.427911 | 64.275434 | 654232.180328 | 77.125 | 75.716822 | 578877.9375 | 91.181948 | 89.562223 | 596522.5 | 94.061211 | 91.813589 | 876983.809524 | 101.910419 | 99.875898 | 787098.354839 | 137.06419 | 134.170984 | 1255167.301587 | 172.816968 | 169.111603 | 868528.095238 | 213.796 | 208.346733 | 1115914.7 |
| 0 | ALLE | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 42.559722 | 41.557444 | 2326585.333333 | 49.590262 | 48.319852 | 1270741.557377 | 51.807556 | 50.721571 | 705595.587302 | 51.752297 | 50.791828 | 587033.5625 | 51.206841 | 50.206476 | 723700.015873 | 56.497984 | 55.437393 | 762305.081967 | 60.683825 | 59.811143 | 564986.84127 | 60.533687 | 59.331172 | 562724.65625 | 63.634683 | 62.399492 | 778905.380952 | 61.081 | 59.526738 | 912952.032787 | 66.168641 | 65.057437 | 652007.0 | 70.223922 | 69.284297 | 588783.5 | 66.34781 | 65.213746 | 800261.730159 | 70.690645 | 69.571823 | 775755.725806 | 78.576492 | 77.469111 | 747584.571429 | 81.224873 | 80.155159 | 517902.968254 | 86.157333 | 84.726927 | 796675.4 |
| 0 | LNT | 19.833667 | 19.630533 | 1096029.6 | 21.7005 | 21.357781 | 1031537.703125 | 22.163375 | 21.863391 | 1133217.8125 | 22.626413 | 22.287175 | 1118667.571429 | 23.384557 | 23.072443 | 1122770.180328 | 25.851143 | 25.529857 | 1148605.238095 | 25.994484 | 25.654 | 1289851.59375 | 28.091159 | 27.61719 | 1222716.936508 | 29.901721 | 29.356951 | 1519714.639344 | 28.033143 | 27.640937 | 1334767.587302 | 27.474594 | 27.022156 | 1400986.171875 | 28.269857 | 27.814222 | 1324544.444444 | 32.027246 | 31.455148 | 1505386.147541 | 35.556656 | 34.981984 | 2127155.546875 | 37.931031 | 37.304672 | 1506166.765625 | 35.969651 | 35.375222 | 1294674.31746 | 37.607484 | 37.134 | 1213372.370968 | 39.966476 | 39.541794 | 1381354.412698 | 41.404048 | 40.971587 | 1011096.301587 | 43.097533 | 42.592293 | 1377366.533333 |
| 0 | ALL | 41.775167 | 41.2432 | 3434803.983333 | 45.130828 | 44.478172 | 3923570.75 | 46.851547 | 46.229 | 2952061.890625 | 49.845937 | 49.225032 | 2502155.619048 | 50.444 | 49.757066 | 3070192.934426 | 54.400365 | 53.872794 | 2482041.920635 | 56.904359 | 56.361063 | 2327236.03125 | 62.651508 | 61.936 | 2418033.063492 | 67.630475 | 66.660803 | 2569832.672131 | 65.974413 | 65.215794 | 2616602.15873 | 60.50425 | 59.549359 | 3279574.0625 | 60.445508 | 59.56127 | 2817078.380952 | 61.341361 | 60.395262 | 2948406.852459 | 65.640938 | 64.841594 | 1997076.203125 | 67.744703 | 67.177312 | 1629849.046875 | 69.673952 | 68.879175 | 2349906.809524 | 78.158903 | 77.394903 | 1819662.048387 | 84.550587 | 83.749984 | 1847252.634921 | 91.21501 | 90.157444 | 1960922.936508 | 94.74633 | 93.664003 | 1809744.966667 |
| 0 | GOOGL | 391.94 | 386.54 | 4710365.566667 | 427.770781 | 420.828594 | 4744065.375 | 446.389219 | 440.9575 | 3683857.9375 | 510.070476 | 503.04619 | 3619180.253968 | 589.497869 | 580.325738 | 4558310.721311 | 557.104603 | 545.903333 | 2446307.904762 | 593.085625 | 585.090781 | 1612151.828125 | 551.79746 | 542.204127 | 2198303.619048 | 546.473607 | 536.552951 | 2368245.360656 | 553.551905 | 545.467778 | 1767785.650794 | 651.789531 | 636.782187 | 2616696.96875 | 745.732619 | 731.616508 | 2054101.825397 | 745.071836 | 727.587508 | 2133765.016393 | 739.610859 | 728.390531 | 1371835.65625 | 785.743031 | 777.097047 | 1061037.15625 | 806.646873 | 793.98481 | 1732969.52381 | 844.654742 | 835.665145 | 1500393.129032 | 941.14981 | 928.679 | 1699425.68254 | 954.277016 | 941.745508 | 1649185.634921 | 1018.2223 | 1004.470233 | 1591438.266667 |
| 0 | MO | 27.804733 | 27.513033 | 13021175.25 | 30.020234 | 29.596156 | 11530296.125 | 29.783438 | 29.420891 | 10201472.640625 | 31.517444 | 31.145762 | 8860092.920635 | 31.564984 | 31.140131 | 10249242.163934 | 35.299429 | 34.885984 | 8557514.396825 | 38.049359 | 37.613687 | 7041493.1875 | 43.771159 | 43.163921 | 7621816.238095 | 48.308754 | 47.592033 | 8041661.639344 | 46.593317 | 45.944587 | 7200026.952381 | 49.801453 | 48.957937 | 7766711.328125 | 54.317063 | 53.491048 | 6346995.714286 | 57.344525 | 56.343689 | 7185249.213115 | 61.154234 | 60.290312 | 6438985.9375 | 64.032469 | 63.298719 | 5463859.625 | 62.576143 | 61.735317 | 6994040.809524 | 70.895935 | 70.206661 | 6767171.483871 | 72.274222 | 71.559794 | 5800049.904762 | 66.655587 | 65.611365 | 8136257.253968 | 64.755757 | 63.85896 | 6578395.933333 |
| 0 | AMZN | 269.118 | 262.854 | 3522848.05 | 269.194062 | 263.381406 | 3039878.9375 | 300.553125 | 294.989531 | 2420409.5625 | 362.279683 | 354.804127 | 2940043.174603 | 375.857541 | 366.867541 | 3931548.016393 | 320.385238 | 311.777143 | 4657967.349206 | 335.560938 | 329.103437 | 3587100.65625 | 315.103968 | 307.935556 | 4204031.571429 | 355.356557 | 347.714426 | 3923801.163934 | 421.807937 | 414.614286 | 3037333.206349 | 512.614375 | 500.030937 | 4089669.734375 | 635.442857 | 621.514841 | 4296253.888889 | 576.212148 | 557.810295 | 4989470.04918 | 681.847781 | 670.028109 | 3135022.640625 | 769.382875 | 760.123859 | 2378967.3125 | 791.033063 | 777.629952 | 3940060.904762 | 836.813048 | 827.159419 | 3134660.548387 | 961.248302 | 946.97246 | 3873340.206349 | 989.650429 | 975.274317 | 3407886.507937 | 1038.752167 | 1022.620467 | 3624009.266667 |
| 0 | AEE | 27.581883 | 27.2762 | 2239801.833333 | 29.653313 | 29.193766 | 2050389.765625 | 29.771781 | 29.333625 | 1813607.015625 | 31.188175 | 30.726413 | 1682456.365079 | 33.811361 | 33.330541 | 2031949.606557 | 35.396603 | 34.857952 | 2168062.507937 | 35.06275 | 34.561078 | 1501801.734375 | 38.459984 | 37.736508 | 1714479.365079 | 39.92759 | 39.136557 | 2078598.606557 | 37.14381 | 36.593 | 1494496.555556 | 37.614984 | 36.93275 | 1793303.921875 | 40.935873 | 40.251238 | 1859961.571429 | 43.962672 | 43.098574 | 2283694.459016 | 47.250312 | 46.486203 | 2023767.28125 | 49.458922 | 48.746828 | 1539681.015625 | 48.492365 | 47.693873 | 1554330.47619 | 52.6895 | 52.027306 | 1447328.193548 | 55.068048 | 54.443286 | 1583701.587302 | 57.753825 | 57.100922 | 1367012.222222 | 61.177167 | 60.499507 | 1271617.766667 |
| 0 | AAL | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 25.450929 | 24.744214 | 13773163.642857 | 33.337885 | 32.335541 | 11510576.688525 | 38.284619 | 37.01619 | 12882764.603175 | 38.769172 | 37.566609 | 10396599.125 | 42.478651 | 40.844746 | 16353052.047619 | 50.163951 | 48.608475 | 15114543.704918 | 45.286794 | 43.947 | 12909723.285714 | 40.863875 | 39.51525 | 11103670.140625 | 42.985556 | 41.79873 | 8641494.095238 | 40.070148 | 38.767656 | 9220325.590164 | 33.859359 | 32.891188 | 9850499.3125 | 35.320672 | 34.3435 | 8649792.484375 | 43.729714 | 42.570048 | 6898510.587302 | 45.53771 | 44.359613 | 6820026.177419 | 46.744032 | 45.639381 | 6349688.460317 | 48.805619 | 47.672143 | 5820506.15873 | 50.257267 | 49.0754 | 4891022.533333 |
| 0 | AEP | 38.16775 | 37.795517 | 3277158.716667 | 40.749 | 40.151016 | 3817119.09375 | 38.155797 | 37.600766 | 3865680.796875 | 39.953571 | 39.392444 | 4153897.253968 | 42.539951 | 41.96441 | 3284502.04918 | 46.640651 | 45.978524 | 3352281.825397 | 47.244516 | 46.573125 | 2618190.4375 | 51.561397 | 50.680635 | 3023673.285714 | 54.140082 | 53.113279 | 3176102.278689 | 50.828159 | 50.138587 | 2712356.873016 | 51.449766 | 50.550516 | 3062072.65625 | 53.019968 | 52.143 | 3349036.365079 | 58.439508 | 57.327066 | 3541953.245902 | 62.276391 | 61.355812 | 2758869.453125 | 64.582781 | 63.727406 | 2528305.421875 | 59.796524 | 58.856286 | 2880379.142857 | 63.224661 | 62.524597 | 2422669.870968 | 68.402206 | 67.732016 | 2461902.587302 | 70.743556 | 70.028413 | 2350829.126984 | 73.047233 | 72.237 | 2089546.366667 |
| 0 | AXP | 58.424617 | 57.642217 | 5916452.983333 | 67.144656 | 66.072609 | 6094520.890625 | 71.165656 | 70.142172 | 4506820.21875 | 77.688333 | 76.752143 | 3872488.825397 | 85.156115 | 83.758525 | 4386631.245902 | 85.84054 | 84.663698 | 3528490.063492 | 86.043828 | 85.038156 | 3900152.140625 | 85.935381 | 84.675778 | 4373285.539683 | 80.077197 | 78.707738 | 8398035.081967 | 76.511556 | 75.58981 | 6278623.126984 | 74.554031 | 73.371109 | 6229667.703125 | 70.75154 | 69.766095 | 5752311.52381 | 56.939426 | 55.732574 | 8905854.016393 | 62.205594 | 61.188281 | 5104626.359375 | 63.319953 | 62.471031 | 4194254.015625 | 68.611254 | 67.594175 | 4694169.079365 | 77.76121 | 76.76821 | 4369751.467742 | 78.962667 | 78.062667 | 3541468.888889 | 86.027794 | 85.152111 | 3221804.190476 | 93.794167 | 92.721347 | 3273773.8 |
| 0 | AIG | 35.63535 | 34.937367 | 18662677.1 | 40.712 | 39.644031 | 18893028.9375 | 44.973734 | 44.233453 | 10869456.828125 | 47.204571 | 46.434095 | 10460289.984127 | 47.300574 | 46.446148 | 10389889.934426 | 50.300889 | 49.538619 | 8359558.063492 | 52.024625 | 51.378281 | 7492150.28125 | 51.222714 | 50.449048 | 9040392.380952 | 51.376623 | 50.522656 | 9553777.983607 | 56.659952 | 55.859921 | 8016945.84127 | 58.858641 | 57.882594 | 8173436.71875 | 58.965571 | 58.01373 | 9109713.015873 | 52.664508 | 51.560426 | 10310924.180328 | 53.940625 | 53.150031 | 7151242.59375 | 56.055703 | 55.406781 | 5828523.03125 | 61.984254 | 61.160714 | 6180682.111111 | 63.744532 | 62.847677 | 6946704.741935 | 62.036651 | 61.279222 | 6627421.428571 | 62.432125 | 61.689286 | 4870472.984127 | 63.60202 | 62.785417 | 5001191.566667 |
| 0 | AMT | 71.171967 | 70.203417 | 2681577.35 | 74.151641 | 72.700094 | 2828155.21875 | 67.497469 | 66.230937 | 4176361.875 | 72.705429 | 71.528095 | 2493615.31746 | 76.560115 | 75.372115 | 2451125.47541 | 81.45619 | 80.470778 | 1942582.888889 | 90.019438 | 88.932203 | 1950203.6875 | 93.363698 | 92.054143 | 2159107.634921 | 92.929443 | 91.358689 | 3063168.229508 | 90.779127 | 89.418984 | 2496382.952381 | 90.769953 | 89.317141 | 2429212.671875 | 94.292524 | 92.840778 | 2010147.253968 | 91.148311 | 89.217967 | 2612946.459016 | 103.817969 | 102.446625 | 1833266.359375 | 112.187187 | 110.758594 | 1892188.5625 | 107.569206 | 105.636381 | 2785039.603175 | 109.017419 | 107.594516 | 2162754.129032 | 127.789841 | 126.001587 | 2019509.253968 | 139.51873 | 137.524381 | 1779885.222222 | 142.252067 | 139.603367 | 2029483.5 |
| 0 | AWK | 35.237433 | 34.778283 | 1025179.866667 | 37.365094 | 36.784578 | 992241.484375 | 37.703328 | 37.110172 | 913628.921875 | 38.442111 | 37.875413 | 774682.396825 | 39.933492 | 39.397492 | 809006.459016 | 43.639984 | 43.091905 | 810337.079365 | 45.719391 | 45.128344 | 622071.796875 | 48.767746 | 47.991571 | 787525.825397 | 51.57123 | 50.692295 | 945761.213115 | 50.208984 | 49.49719 | 928097.31746 | 50.117781 | 49.294469 | 916592.296875 | 55.614905 | 54.709651 | 921576.619048 | 63.22118 | 62.047738 | 2237006.491803 | 72.439109 | 71.299516 | 1342465.15625 | 76.716813 | 75.599984 | 1249975.6875 | 71.402063 | 70.18054 | 1046900.952381 | 73.440645 | 72.457113 | 942151.951613 | 78.155667 | 77.214238 | 889762.333333 | 80.783873 | 79.917619 | 749040.920635 | 86.3166 | 85.294267 | 863855.366667 |
| 0 | AMP | 61.237867 | 60.3713 | 1646950.916667 | 70.256109 | 68.886281 | 1516202.953125 | 80.160359 | 78.970172 | 1108717.375 | 93.929937 | 92.585571 | 1151260.777778 | 100.700115 | 98.922295 | 1392152.0 | 103.056683 | 101.413302 | 1117441.873016 | 113.578438 | 112.099375 | 862742.515625 | 117.761111 | 115.726365 | 983218.507937 | 123.361639 | 121.201967 | 1069675.52459 | 119.874762 | 118.00873 | 1595580.619048 | 111.807031 | 109.493828 | 1358759.46875 | 106.46346 | 104.32081 | 1108963.15873 | 86.277902 | 83.776607 | 1847693.655738 | 93.391797 | 91.510938 | 1325956.453125 | 94.218453 | 92.751344 | 900529.609375 | 103.961698 | 101.794968 | 1222438.666667 | 122.39371 | 120.047097 | 1158429.677419 | 126.434286 | 124.307619 | 1094924.507937 | 139.690794 | 137.466032 | 1047268.111111 | 155.4065 | 153.385 | 777338.433333 |
| 0 | AME | 40.1458 | 39.5763 | 890057.9 | 40.912266 | 40.243781 | 1132761.3125 | 44.16825 | 43.51375 | 1679207.875 | 47.487794 | 46.641397 | 1077237.047619 | 50.919082 | 50.149918 | 893441.327869 | 51.665873 | 50.94927 | 819664.460317 | 51.114672 | 50.49175 | 920995.015625 | 50.130683 | 49.184952 | 1456005.825397 | 50.591525 | 49.795426 | 1205173.131148 | 53.021397 | 52.373635 | 1035037.888889 | 53.579391 | 52.651234 | 1434127.71875 | 54.565159 | 53.666889 | 1389681.47619 | 47.504541 | 46.49777 | 1716285.622951 | 47.942813 | 47.254188 | 1538806.34375 | 47.539391 | 46.913469 | 1536147.125 | 47.530635 | 46.773587 | 2382602.873016 | 52.837903 | 52.102081 | 1744188.596774 | 59.134095 | 58.403714 | 1455446.365079 | 63.455905 | 62.755048 | 903732.825397 | 68.162663 | 67.344667 | 1160574.3 |
Unemployment and GDP Excel¶
Unemployment and GDP Excel Finally, quarterly GDP and unemployment are scraped from an excel file found at https://www.cbo.gov/system/files/2019-04/51137-2017-06-potentialgdp-2.xlsx.We took this data set because it focused on the 2 economic indicators which we wanted to focus on. We chose GDP and unemployment because they are indicators which will affect a majority of the GICS Sectors (industries) that our data was grouped into, however, their effects on it are not as obvious as something like interest rates. This data set showed us the Real GDP, Nominal GDP, Unemployment Underlying Long Term, and Unemployment Natural (defined below):
1. Real GDP: Measures the actual growth of production without any distorting effects from inflation.
2. Nominal GDP: Total value of all goods and services produced in a give time period, usually quarterly or annually.
3. Unemployment Underlying Long Term: Refers to the number of previously employed people in the U.S. who have been facing unemployment for at least 6 months.
4. Unemployment Natural: Refers to the number of previously employed people in the U.S. who are currently facing unemployment.
Link to Nominal vs. Real GDP: https://www.investopedia.com/ask/answers/030515/when-do-economists-use-real-gdp-instead-gdp.asp#:~:text=Key%20Takeaways,any%20distorting%20effects%20from%20inflation.
#gdp and unemployment excel file
#starts extracting data at 2013Q1(252 rows down)
gdp_unemp_df = pd.read_excel("gdp_unemployment_data.xlsx", sheet_name='3. GDP and Unemployment', skiprows=252)
#removes empty columns
gdp_unemp_df = gdp_unemp_df.iloc[:, [0,1,2,4,5]]
#renames columns
gdp_unemp_df.columns = ['Quarter', 'Real GDP', 'Nominal GDP', 'Unemployment Underlying Long Term', 'Unemployment Natural']
#drops all rows that have NaN as Nominal GDP
gdp_unemp_df = gdp_unemp_df.dropna(subset=['Nominal GDP'])
# Filter to include only rows up to 2017Q4
gdp_unemp_df = gdp_unemp_df[(gdp_unemp_df['Quarter'] >= '2013Q1') & (gdp_unemp_df['Quarter'] <= '2017Q4')]
gdp_unemp_df.set_index("Quarter", inplace=True)
# print(gdp_unemp_df.to_string(index=False))
gdp_unemp_df.head(20)
| Real GDP | Nominal GDP | Unemployment Underlying Long Term | Unemployment Natural | |
|---|---|---|---|---|
| Quarter | ||||
| 2013Q1 | 15957.4 | 16965.6 | 5.048 | 5.540 |
| 2013Q2 | 16017.8 | 17069.4 | 5.025 | 5.395 |
| 2013Q3 | 16079.7 | 17223.3 | 5.002 | 5.292 |
| 2013Q4 | 16142.8 | 17381.6 | 4.976 | 5.176 |
| 2014Q1 | 16206.5 | 17525.7 | 4.953 | 5.073 |
| 2014Q2 | 16272.1 | 17690.1 | 4.930 | 4.990 |
| 2014Q3 | 16338.6 | 17838.2 | 4.915 | 4.935 |
| 2014Q4 | 16405.7 | 17934.9 | 4.879 | 4.879 |
| 2015Q1 | 16473.1 | 18006.3 | 4.830 | 4.830 |
| 2015Q2 | 16540.2 | 18181.3 | 4.785 | 4.785 |
| 2015Q3 | 16607.0 | 18312.2 | 4.765 | 4.765 |
| 2015Q4 | 16673.3 | 18423.7 | 4.755 | 4.755 |
| 2016Q1 | 16738.0 | 18518.1 | 4.749 | 4.749 |
| 2016Q2 | 16801.3 | 18694.4 | 4.745 | 4.745 |
| 2016Q3 | 16864.3 | 18831.0 | 4.743 | 4.743 |
| 2016Q4 | 16927.7 | 18999.3 | 4.742 | 4.742 |
| 2017Q1 | 16992.2 | 19178.8 | 4.740 | 4.740 |
| 2017Q2 | 17058.2 | 19280.7 | 4.738 | 4.738 |
| 2017Q3 | 17125.5 | 19468.0 | 4.737 | 4.737 |
| 2017Q4 | 17194.2 | 19647.5 | 4.735 | 4.735 |
Data Processing (Cont'd)
Next, in order to eventually visualize and analyze the data we must merge the wikipedia information (ticker, security, and sector) with the newly formed kaggle stock data (quarterly high and low values between 2013-2017).
#merging the stock dataframe from wiki and the kaggle stock data frame.
#Merge on symbol so we can work with 1 dataframe
stocks_df = pd.merge(stocks_df, kaggle_df, on="Symbol")
stocks_df.head(20)
| Symbol | Security | GICS Sector | 2013 Q1 Mean High | 2013 Q1 Mean Low | 2013 Q1 Mean Volume | 2013 Q2 Mean High | 2013 Q2 Mean Low | 2013 Q2 Mean Volume | 2013 Q3 Mean High | 2013 Q3 Mean Low | 2013 Q3 Mean Volume | 2013 Q4 Mean High | 2013 Q4 Mean Low | 2013 Q4 Mean Volume | 2014 Q1 Mean High | 2014 Q1 Mean Low | 2014 Q1 Mean Volume | 2014 Q2 Mean High | 2014 Q2 Mean Low | 2014 Q2 Mean Volume | 2014 Q3 Mean High | 2014 Q3 Mean Low | 2014 Q3 Mean Volume | 2014 Q4 Mean High | 2014 Q4 Mean Low | 2014 Q4 Mean Volume | 2015 Q1 Mean High | 2015 Q1 Mean Low | 2015 Q1 Mean Volume | 2015 Q2 Mean High | 2015 Q2 Mean Low | 2015 Q2 Mean Volume | 2015 Q3 Mean High | 2015 Q3 Mean Low | 2015 Q3 Mean Volume | 2015 Q4 Mean High | 2015 Q4 Mean Low | 2015 Q4 Mean Volume | 2016 Q1 Mean High | 2016 Q1 Mean Low | 2016 Q1 Mean Volume | 2016 Q2 Mean High | 2016 Q2 Mean Low | 2016 Q2 Mean Volume | 2016 Q3 Mean High | 2016 Q3 Mean Low | 2016 Q3 Mean Volume | 2016 Q4 Mean High | 2016 Q4 Mean Low | 2016 Q4 Mean Volume | 2017 Q1 Mean High | 2017 Q1 Mean Low | 2017 Q1 Mean Volume | 2017 Q2 Mean High | 2017 Q2 Mean Low | 2017 Q2 Mean Volume | 2017 Q3 Mean High | 2017 Q3 Mean Low | 2017 Q3 Mean Volume | 2017 Q4 Mean High | 2017 Q4 Mean Low | 2017 Q4 Mean Volume | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | MMM | 3M | Industrials | 91.312167 | 90.39165 | 3000908.633333 | 98.020156 | 96.832453 | 3184629.796875 | 105.455313 | 104.375 | 2593571.890625 | 115.978413 | 114.62381 | 3003412.984127 | 122.193607 | 120.529344 | 3286431.163934 | 129.457937 | 128.161111 | 2362018.47619 | 133.417656 | 132.191094 | 2302554.203125 | 143.35873 | 141.33619 | 3041470.253968 | 155.016885 | 152.888197 | 2458525.803279 | 151.794762 | 150.024921 | 2440807.730159 | 140.275156 | 138.233281 | 2956910.703125 | 147.416667 | 145.417778 | 2596462.174603 | 148.371311 | 146.186393 | 2821154.639344 | 163.714844 | 161.959375 | 1975933.640625 | 174.425313 | 172.915781 | 1730216.75 | 169.834127 | 167.954603 | 2021322.095238 | 181.455806 | 179.752903 | 1989174.451613 | 199.043492 | 197.367778 | 1849468.174603 | 207.525206 | 205.418159 | 1684750.68254 | 225.0516 | 222.543667 | 1925851.566667 |
| 1 | AOS | A. O. Smith | Industrials | 16.693183 | 16.394633 | 1580067.933333 | 17.946859 | 17.548625 | 1883052.6875 | 20.077312 | 19.714828 | 1230033.8125 | 24.618 | 24.195079 | 1645559.492063 | 23.568213 | 23.127361 | 1678852.622951 | 23.258302 | 22.849127 | 1349398.031746 | 23.570563 | 23.206109 | 1015275.21875 | 25.370365 | 24.94981 | 999006.349206 | 29.686967 | 29.156984 | 1118223.540984 | 33.707444 | 33.210063 | 1020983.460317 | 33.785922 | 32.879422 | 1674719.625 | 37.247635 | 36.402302 | 1528875.650794 | 34.733787 | 33.713066 | 1840718.47541 | 40.014906 | 39.273094 | 1472523.25 | 46.386844 | 45.709766 | 1067305.671875 | 48.150698 | 47.271032 | 1280703.47619 | 49.546 | 48.752032 | 919397.596774 | 54.049048 | 53.272238 | 869093.190476 | 56.258095 | 55.529968 | 1342885.873016 | 60.550513 | 59.777517 | 862284.733333 |
| 2 | ABT | Abbott | Health Care | 30.724817 | 30.309883 | 12451680.116667 | 33.482594 | 32.897703 | 7951001.546875 | 32.113391 | 31.673766 | 8005789.8125 | 33.834984 | 33.331286 | 7459989.047619 | 35.685213 | 35.102541 | 9174882.016393 | 36.46273 | 35.985317 | 6591339.587302 | 39.491984 | 39.013391 | 5035284.0 | 40.963206 | 40.289476 | 5394024.698413 | 43.385066 | 42.705344 | 5984359.327869 | 45.528857 | 44.942651 | 4981663.079365 | 45.127313 | 44.260953 | 7043142.28125 | 42.363111 | 41.620952 | 6665399.507937 | 38.360328 | 37.510508 | 8101788.295082 | 38.6665 | 38.062969 | 10643153.84375 | 41.879109 | 41.256594 | 9773374.203125 | 39.127889 | 38.410063 | 8882569.047619 | 42.683161 | 42.123548 | 8615529.145161 | 45.023984 | 44.466397 | 6880527.587302 | 50.187952 | 49.569937 | 5433541.793651 | 55.202797 | 54.498413 | 5491270.8 |
| 3 | ABBV | AbbVie | Health Care | 31.711379 | 30.96831 | 10897951.293103 | 37.874344 | 36.874609 | 7384147.078125 | 38.554094 | 37.840172 | 4690873.125 | 43.141349 | 42.268556 | 6319228.111111 | 44.859967 | 43.921311 | 7609473.032787 | 46.878524 | 45.901365 | 6642542.222222 | 49.877344 | 48.994797 | 13670690.390625 | 57.269476 | 55.990286 | 13088959.507937 | 55.40041 | 54.023197 | 13034415.47541 | 60.105444 | 59.026444 | 12052673.666667 | 60.615578 | 59.162406 | 10942893.65625 | 54.219746 | 52.613365 | 9892407.507937 | 53.032361 | 51.245393 | 9461727.672131 | 58.174437 | 56.981422 | 9176243.015625 | 62.272141 | 61.371672 | 7220892.84375 | 59.454603 | 58.236333 | 7871293.349206 | 61.587597 | 60.726435 | 7046768.645161 | 66.49673 | 65.606238 | 5782878.920635 | 76.087587 | 74.74454 | 6246636.222222 | 93.05493 | 90.88035 | 6171383.966667 |
| 4 | ACN | Accenture | Information Technology | 65.413133 | 64.46965 | 2958136.65 | 71.669828 | 70.472016 | 3545633.890625 | 66.856953 | 65.881375 | 3753502.28125 | 68.918381 | 67.920349 | 3565505.31746 | 75.105098 | 73.927754 | 3385196.557377 | 74.092238 | 73.077921 | 3025692.761905 | 74.035734 | 73.190891 | 2751022.140625 | 77.558571 | 76.471746 | 3074781.619048 | 83.717639 | 82.393557 | 2900945.442623 | 90.404254 | 89.257111 | 2328701.730159 | 94.127406 | 92.481 | 2749655.25 | 100.857413 | 99.433635 | 2597977.174603 | 98.467754 | 96.567508 | 3563175.213115 | 111.829062 | 110.387188 | 2252809.296875 | 110.0425 | 108.5325 | 2355634.875 | 116.823968 | 115.13 | 2508397.920635 | 117.692258 | 116.198548 | 2577157.33871 | 121.100476 | 119.760794 | 2231790.698413 | 129.643016 | 128.244762 | 1905023.0 | 140.0467 | 138.514 | 1750601.6 |
| 5 | ADBE | Adobe Inc. | Information Technology | 39.7829 | 39.11015 | 3569916.466667 | 44.708719 | 43.920484 | 3709883.078125 | 48.072375 | 47.381406 | 3170725.46875 | 55.64381 | 54.627508 | 3985544.47619 | 64.792295 | 63.390164 | 3924305.557377 | 65.134952 | 63.753222 | 3614820.587302 | 71.433422 | 70.293672 | 2636903.1875 | 70.770476 | 69.400349 | 3332754.111111 | 75.196279 | 73.791721 | 2366642.131148 | 78.606508 | 77.328 | 2462360.793651 | 81.957266 | 80.139375 | 3419304.015625 | 90.60054 | 88.735 | 2952103.492063 | 87.827552 | 85.323174 | 3681589.95082 | 96.7314 | 95.154775 | 2111794.109375 | 100.877748 | 99.587905 | 1807114.6875 | 106.782841 | 105.022873 | 2293033.888889 | 117.874855 | 116.368194 | 2306142.419355 | 137.783524 | 135.854397 | 2570180.857143 | 150.173397 | 148.012063 | 1972602.984127 | 166.874633 | 163.9689 | 3072533.633333 |
| 6 | AMD | Advanced Micro Devices | Information Technology | 2.672167 | 2.564667 | 18729917.45 | 3.521875 | 3.371563 | 28015006.546875 | 3.874219 | 3.744844 | 29267395.953125 | 3.633492 | 3.531429 | 24142526.84127 | 3.882131 | 3.758033 | 25269923.836066 | 4.116508 | 3.99381 | 21170687.333333 | 4.112656 | 3.992812 | 29876772.28125 | 2.809206 | 2.703492 | 18739877.619048 | 2.868279 | 2.74918 | 17290156.918033 | 2.463016 | 2.385556 | 14835027.238095 | 1.936641 | 1.833281 | 13248920.109375 | 2.268175 | 2.167619 | 9464912.730159 | 2.35041 | 2.198033 | 14099698.442623 | 3.985508 | 3.764922 | 22684549.421875 | 6.523773 | 6.237422 | 35252396.84375 | 8.582698 | 8.168543 | 46450313.365079 | 12.959565 | 12.396373 | 64543556.741935 | 12.506833 | 11.939441 | 76875665.444444 | 13.323238 | 12.81497 | 64954818.825397 | 12.990133 | 12.568997 | 61987566.333333 |
| 7 | AES | AES Corporation | Utilities | 10.10842 | 9.914168 | 7043960.75 | 11.204391 | 10.953047 | 6695538.890625 | 11.321672 | 11.107187 | 4528306.390625 | 12.618524 | 12.345254 | 6451758.952381 | 12.553852 | 12.295115 | 6094453.245902 | 12.804984 | 12.569683 | 6143209.666667 | 13.290375 | 13.076266 | 4156436.75 | 12.298159 | 11.993286 | 6739359.666667 | 11.363377 | 11.076623 | 7149936.786885 | 12.214016 | 11.966571 | 6735586.222222 | 11.136202 | 10.871042 | 6675684.125 | 9.324605 | 9.074429 | 6452733.015873 | 9.368844 | 9.050907 | 6921824.786885 | 10.665641 | 10.432906 | 5180647.8125 | 11.885609 | 11.656453 | 4655622.859375 | 11.352571 | 11.085476 | 5835446.761905 | 11.157871 | 10.906742 | 5824308.983871 | 11.298889 | 11.09727 | 5616984.539683 | 11.091317 | 10.909143 | 4938647.380952 | 10.9345 | 10.755767 | 5099342.833333 |
| 8 | AFL | Aflac | Financials | 46.2141 | 45.433267 | 4104190.6 | 49.176641 | 48.289062 | 3435438.578125 | 54.800359 | 54.012328 | 2189281.828125 | 59.970825 | 59.241762 | 2141384.761905 | 58.619787 | 57.822098 | 2175752.688525 | 57.771492 | 57.017746 | 1926278.460317 | 56.681 | 56.033297 | 2120122.046875 | 55.331175 | 54.628984 | 2444321.063492 | 57.613 | 56.80082 | 2244606.655738 | 60.131063 | 59.417365 | 2073244.571429 | 57.887156 | 56.950672 | 2387365.484375 | 59.969111 | 59.055286 | 2358668.0 | 57.645623 | 56.610869 | 2563169.836066 | 66.453531 | 65.644219 | 2310928.90625 | 71.264313 | 70.539875 | 1739223.609375 | 69.491175 | 68.60873 | 2020415.47619 | 70.262032 | 69.446823 | 2214819.548387 | 74.90873 | 74.134111 | 1846555.492063 | 80.695032 | 79.83694 | 1541004.079365 | 84.0718 | 83.197033 | 1440761.533333 |
| 9 | A | Agilent Technologies | Health Care | 29.667933 | 29.165533 | 5100460.5 | 30.035625 | 29.418578 | 5595136.578125 | 32.569641 | 32.047453 | 3832068.15625 | 36.757286 | 36.169667 | 3353795.761905 | 40.050869 | 39.277967 | 4170020.704918 | 39.052714 | 38.360952 | 2721890.15873 | 39.777484 | 39.240562 | 2585435.890625 | 39.484873 | 38.748032 | 3063606.825397 | 39.669721 | 38.975672 | 2532460.409836 | 40.752206 | 40.151476 | 2448139.761905 | 37.349875 | 36.627813 | 2521089.796875 | 38.163619 | 37.420571 | 3004962.285714 | 37.845246 | 36.91459 | 2357458.967213 | 42.781375 | 42.130453 | 2219362.46875 | 46.398047 | 45.748719 | 1877364.125 | 45.384619 | 44.604825 | 1990839.174603 | 50.474952 | 49.724629 | 2029887.822581 | 57.100619 | 56.334349 | 2009349.047619 | 62.56181 | 61.779984 | 1643051.47619 | 67.413667 | 66.701367 | 1397107.4 |
| 10 | APD | Air Products and Chemicals | Materials | 77.760883 | 76.798833 | 1093138.95 | 81.289891 | 80.249953 | 1552260.046875 | 92.838203 | 91.196234 | 2046325.296875 | 98.905492 | 97.454413 | 1143625.253968 | 104.973115 | 103.321607 | 1455185.934426 | 110.916032 | 109.299365 | 1199132.857143 | 122.2025 | 120.556875 | 1628685.890625 | 127.274921 | 125.009206 | 1710348.936508 | 139.970164 | 137.682459 | 1108369.032787 | 138.894444 | 136.88254 | 1078376.428571 | 130.466094 | 127.980312 | 1500850.15625 | 129.770635 | 127.481429 | 1433344.746032 | 125.588852 | 122.712131 | 1905698.672131 | 138.716563 | 136.732812 | 1064761.28125 | 145.268594 | 143.489688 | 1379288.796875 | 137.335556 | 135.394921 | 1452917.984127 | 139.034194 | 137.274355 | 1545373.048387 | 141.140476 | 139.533016 | 1303113.492063 | 146.273016 | 144.856825 | 1410870.507937 | 156.523133 | 155.0267 | 926571.433333 |
| 11 | AKAM | Akamai | Information Technology | 38.345833 | 37.470117 | 3942939.716667 | 42.344391 | 41.340719 | 2741468.78125 | 47.475531 | 46.609094 | 1724353.609375 | 47.71781 | 46.69873 | 2251961.84127 | 55.955328 | 54.503295 | 2719331.836066 | 56.288587 | 54.859762 | 2191404.68254 | 60.705016 | 59.619875 | 1624116.5 | 60.925302 | 59.595159 | 1622168.206349 | 66.694918 | 65.303443 | 1631151.639344 | 74.717937 | 73.624524 | 1419948.698413 | 72.898438 | 71.173375 | 1800375.140625 | 62.297746 | 60.930667 | 2270051.634921 | 51.614267 | 49.998952 | 2834324.114754 | 53.284647 | 51.860842 | 1531503.703125 | 54.033789 | 53.00923 | 1895174.203125 | 64.308589 | 62.977471 | 2097015.238095 | 66.29605 | 65.184232 | 1916476.16129 | 53.296676 | 52.291944 | 2627907.0 | 48.271071 | 47.413467 | 1919155.206349 | 52.275247 | 51.428977 | 2122564.7 |
| 12 | ALB | Albemarle Corporation | Materials | 59.422 | 58.35975 | 738925.166667 | 58.865891 | 57.647281 | 798707.59375 | 59.128953 | 58.218312 | 804109.703125 | 62.520635 | 61.596651 | 662333.936508 | 61.266967 | 60.274475 | 631124.557377 | 65.074079 | 64.164619 | 533168.222222 | 62.160953 | 60.983437 | 1437183.359375 | 56.594016 | 55.267 | 1462997.079365 | 52.807508 | 51.484361 | 2007614.065574 | 57.841937 | 56.750413 | 1095955.634921 | 47.790344 | 46.420312 | 1471876.21875 | 50.553 | 49.121714 | 1480702.650794 | 54.459115 | 52.543164 | 1433544.245902 | 72.837594 | 71.230234 | 1899285.328125 | 81.901984 | 80.286344 | 1283941.09375 | 84.576016 | 82.875873 | 888396.555556 | 97.426516 | 95.477742 | 1036087.532258 | 109.16 | 107.409206 | 1023728.301587 | 119.617524 | 117.308175 | 1430350.84127 | 140.3418 | 137.821167 | 1259730.2 |
| 13 | ARE | Alexandria Real Estate Equities | Real Estate | 61.36855 | 60.743033 | 510877.366667 | 61.040516 | 59.928797 | 812680.375 | 57.557531 | 56.515922 | 497802.3125 | 56.650143 | 55.566413 | 487156.761905 | 62.462639 | 61.485738 | 523381.442623 | 67.283714 | 66.416952 | 380025.349206 | 70.467438 | 69.669313 | 459933.390625 | 76.38346 | 75.279413 | 490275.349206 | 89.050443 | 87.467197 | 471547.639344 | 86.529873 | 85.126968 | 399307.920635 | 84.633359 | 83.013094 | 364752.734375 | 85.659873 | 84.143825 | 468813.809524 | 77.99 | 76.298508 | 515845.04918 | 91.893234 | 90.337047 | 623097.203125 | 105.790109 | 103.950516 | 587047.640625 | 105.292381 | 103.161841 | 609395.222222 | 111.310645 | 109.527903 | 1028053.66129 | 115.439048 | 113.911905 | 727220.412698 | 120.099048 | 118.52127 | 589167.52381 | 123.519667 | 122.114167 | 492825.366667 |
| 14 | ALGN | Align Technology | Health Care | 31.159017 | 30.377733 | 965606.483333 | 34.975797 | 33.909 | 932521.21875 | 44.089359 | 43.058906 | 623338.703125 | 54.480365 | 53.042889 | 1102201.47619 | 56.988443 | 55.11541 | 1269293.868852 | 52.782825 | 51.130048 | 1145288.68254 | 55.01075 | 53.947656 | 664095.84375 | 53.482111 | 52.269952 | 715800.730159 | 57.678639 | 56.164098 | 932696.57377 | 59.708095 | 58.407381 | 876989.904762 | 60.373438 | 58.926797 | 730518.59375 | 64.794143 | 63.197921 | 649318.666667 | 66.427911 | 64.275434 | 654232.180328 | 77.125 | 75.716822 | 578877.9375 | 91.181948 | 89.562223 | 596522.5 | 94.061211 | 91.813589 | 876983.809524 | 101.910419 | 99.875898 | 787098.354839 | 137.06419 | 134.170984 | 1255167.301587 | 172.816968 | 169.111603 | 868528.095238 | 213.796 | 208.346733 | 1115914.7 |
| 15 | ALLE | Allegion | Industrials | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 42.559722 | 41.557444 | 2326585.333333 | 49.590262 | 48.319852 | 1270741.557377 | 51.807556 | 50.721571 | 705595.587302 | 51.752297 | 50.791828 | 587033.5625 | 51.206841 | 50.206476 | 723700.015873 | 56.497984 | 55.437393 | 762305.081967 | 60.683825 | 59.811143 | 564986.84127 | 60.533687 | 59.331172 | 562724.65625 | 63.634683 | 62.399492 | 778905.380952 | 61.081 | 59.526738 | 912952.032787 | 66.168641 | 65.057437 | 652007.0 | 70.223922 | 69.284297 | 588783.5 | 66.34781 | 65.213746 | 800261.730159 | 70.690645 | 69.571823 | 775755.725806 | 78.576492 | 77.469111 | 747584.571429 | 81.224873 | 80.155159 | 517902.968254 | 86.157333 | 84.726927 | 796675.4 |
| 16 | LNT | Alliant Energy | Utilities | 19.833667 | 19.630533 | 1096029.6 | 21.7005 | 21.357781 | 1031537.703125 | 22.163375 | 21.863391 | 1133217.8125 | 22.626413 | 22.287175 | 1118667.571429 | 23.384557 | 23.072443 | 1122770.180328 | 25.851143 | 25.529857 | 1148605.238095 | 25.994484 | 25.654 | 1289851.59375 | 28.091159 | 27.61719 | 1222716.936508 | 29.901721 | 29.356951 | 1519714.639344 | 28.033143 | 27.640937 | 1334767.587302 | 27.474594 | 27.022156 | 1400986.171875 | 28.269857 | 27.814222 | 1324544.444444 | 32.027246 | 31.455148 | 1505386.147541 | 35.556656 | 34.981984 | 2127155.546875 | 37.931031 | 37.304672 | 1506166.765625 | 35.969651 | 35.375222 | 1294674.31746 | 37.607484 | 37.134 | 1213372.370968 | 39.966476 | 39.541794 | 1381354.412698 | 41.404048 | 40.971587 | 1011096.301587 | 43.097533 | 42.592293 | 1377366.533333 |
| 17 | ALL | Allstate | Financials | 41.775167 | 41.2432 | 3434803.983333 | 45.130828 | 44.478172 | 3923570.75 | 46.851547 | 46.229 | 2952061.890625 | 49.845937 | 49.225032 | 2502155.619048 | 50.444 | 49.757066 | 3070192.934426 | 54.400365 | 53.872794 | 2482041.920635 | 56.904359 | 56.361063 | 2327236.03125 | 62.651508 | 61.936 | 2418033.063492 | 67.630475 | 66.660803 | 2569832.672131 | 65.974413 | 65.215794 | 2616602.15873 | 60.50425 | 59.549359 | 3279574.0625 | 60.445508 | 59.56127 | 2817078.380952 | 61.341361 | 60.395262 | 2948406.852459 | 65.640938 | 64.841594 | 1997076.203125 | 67.744703 | 67.177312 | 1629849.046875 | 69.673952 | 68.879175 | 2349906.809524 | 78.158903 | 77.394903 | 1819662.048387 | 84.550587 | 83.749984 | 1847252.634921 | 91.21501 | 90.157444 | 1960922.936508 | 94.74633 | 93.664003 | 1809744.966667 |
| 18 | GOOGL | Alphabet Inc. (Class A) | Communication Services | 391.94 | 386.54 | 4710365.566667 | 427.770781 | 420.828594 | 4744065.375 | 446.389219 | 440.9575 | 3683857.9375 | 510.070476 | 503.04619 | 3619180.253968 | 589.497869 | 580.325738 | 4558310.721311 | 557.104603 | 545.903333 | 2446307.904762 | 593.085625 | 585.090781 | 1612151.828125 | 551.79746 | 542.204127 | 2198303.619048 | 546.473607 | 536.552951 | 2368245.360656 | 553.551905 | 545.467778 | 1767785.650794 | 651.789531 | 636.782187 | 2616696.96875 | 745.732619 | 731.616508 | 2054101.825397 | 745.071836 | 727.587508 | 2133765.016393 | 739.610859 | 728.390531 | 1371835.65625 | 785.743031 | 777.097047 | 1061037.15625 | 806.646873 | 793.98481 | 1732969.52381 | 844.654742 | 835.665145 | 1500393.129032 | 941.14981 | 928.679 | 1699425.68254 | 954.277016 | 941.745508 | 1649185.634921 | 1018.2223 | 1004.470233 | 1591438.266667 |
| 19 | MO | Altria | Consumer Staples | 27.804733 | 27.513033 | 13021175.25 | 30.020234 | 29.596156 | 11530296.125 | 29.783438 | 29.420891 | 10201472.640625 | 31.517444 | 31.145762 | 8860092.920635 | 31.564984 | 31.140131 | 10249242.163934 | 35.299429 | 34.885984 | 8557514.396825 | 38.049359 | 37.613687 | 7041493.1875 | 43.771159 | 43.163921 | 7621816.238095 | 48.308754 | 47.592033 | 8041661.639344 | 46.593317 | 45.944587 | 7200026.952381 | 49.801453 | 48.957937 | 7766711.328125 | 54.317063 | 53.491048 | 6346995.714286 | 57.344525 | 56.343689 | 7185249.213115 | 61.154234 | 60.290312 | 6438985.9375 | 64.032469 | 63.298719 | 5463859.625 | 62.576143 | 61.735317 | 6994040.809524 | 70.895935 | 70.206661 | 6767171.483871 | 72.274222 | 71.559794 | 5800049.904762 | 66.655587 | 65.611365 | 8136257.253968 | 64.755757 | 63.85896 | 6578395.933333 |
Next is the most important cleaning of the data: removing empty/null values. This section deletes any stock’s row that contains any NaNs. NaN values which were in this dataset means that we either had values of companies which weren’t created till after 2013 or shut down before 2017. This meant that these values would’ve skewed our analysis hence why we removed them.
#Clean and preprocess the data to handle missing values and normalize features
#removing any tickers that have 1 or more NaN values in their data
cleaned_stocks_df = stocks_df.dropna()
cleaned_stocks_df.head(20)
| Symbol | Security | GICS Sector | 2013 Q1 Mean High | 2013 Q1 Mean Low | 2013 Q1 Mean Volume | 2013 Q2 Mean High | 2013 Q2 Mean Low | 2013 Q2 Mean Volume | 2013 Q3 Mean High | 2013 Q3 Mean Low | 2013 Q3 Mean Volume | 2013 Q4 Mean High | 2013 Q4 Mean Low | 2013 Q4 Mean Volume | 2014 Q1 Mean High | 2014 Q1 Mean Low | 2014 Q1 Mean Volume | 2014 Q2 Mean High | 2014 Q2 Mean Low | 2014 Q2 Mean Volume | 2014 Q3 Mean High | 2014 Q3 Mean Low | 2014 Q3 Mean Volume | 2014 Q4 Mean High | 2014 Q4 Mean Low | 2014 Q4 Mean Volume | 2015 Q1 Mean High | 2015 Q1 Mean Low | 2015 Q1 Mean Volume | 2015 Q2 Mean High | 2015 Q2 Mean Low | 2015 Q2 Mean Volume | 2015 Q3 Mean High | 2015 Q3 Mean Low | 2015 Q3 Mean Volume | 2015 Q4 Mean High | 2015 Q4 Mean Low | 2015 Q4 Mean Volume | 2016 Q1 Mean High | 2016 Q1 Mean Low | 2016 Q1 Mean Volume | 2016 Q2 Mean High | 2016 Q2 Mean Low | 2016 Q2 Mean Volume | 2016 Q3 Mean High | 2016 Q3 Mean Low | 2016 Q3 Mean Volume | 2016 Q4 Mean High | 2016 Q4 Mean Low | 2016 Q4 Mean Volume | 2017 Q1 Mean High | 2017 Q1 Mean Low | 2017 Q1 Mean Volume | 2017 Q2 Mean High | 2017 Q2 Mean Low | 2017 Q2 Mean Volume | 2017 Q3 Mean High | 2017 Q3 Mean Low | 2017 Q3 Mean Volume | 2017 Q4 Mean High | 2017 Q4 Mean Low | 2017 Q4 Mean Volume | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | MMM | 3M | Industrials | 91.312167 | 90.39165 | 3000908.633333 | 98.020156 | 96.832453 | 3184629.796875 | 105.455313 | 104.375 | 2593571.890625 | 115.978413 | 114.62381 | 3003412.984127 | 122.193607 | 120.529344 | 3286431.163934 | 129.457937 | 128.161111 | 2362018.47619 | 133.417656 | 132.191094 | 2302554.203125 | 143.35873 | 141.33619 | 3041470.253968 | 155.016885 | 152.888197 | 2458525.803279 | 151.794762 | 150.024921 | 2440807.730159 | 140.275156 | 138.233281 | 2956910.703125 | 147.416667 | 145.417778 | 2596462.174603 | 148.371311 | 146.186393 | 2821154.639344 | 163.714844 | 161.959375 | 1975933.640625 | 174.425313 | 172.915781 | 1730216.75 | 169.834127 | 167.954603 | 2021322.095238 | 181.455806 | 179.752903 | 1989174.451613 | 199.043492 | 197.367778 | 1849468.174603 | 207.525206 | 205.418159 | 1684750.68254 | 225.0516 | 222.543667 | 1925851.566667 |
| 1 | AOS | A. O. Smith | Industrials | 16.693183 | 16.394633 | 1580067.933333 | 17.946859 | 17.548625 | 1883052.6875 | 20.077312 | 19.714828 | 1230033.8125 | 24.618 | 24.195079 | 1645559.492063 | 23.568213 | 23.127361 | 1678852.622951 | 23.258302 | 22.849127 | 1349398.031746 | 23.570563 | 23.206109 | 1015275.21875 | 25.370365 | 24.94981 | 999006.349206 | 29.686967 | 29.156984 | 1118223.540984 | 33.707444 | 33.210063 | 1020983.460317 | 33.785922 | 32.879422 | 1674719.625 | 37.247635 | 36.402302 | 1528875.650794 | 34.733787 | 33.713066 | 1840718.47541 | 40.014906 | 39.273094 | 1472523.25 | 46.386844 | 45.709766 | 1067305.671875 | 48.150698 | 47.271032 | 1280703.47619 | 49.546 | 48.752032 | 919397.596774 | 54.049048 | 53.272238 | 869093.190476 | 56.258095 | 55.529968 | 1342885.873016 | 60.550513 | 59.777517 | 862284.733333 |
| 2 | ABT | Abbott | Health Care | 30.724817 | 30.309883 | 12451680.116667 | 33.482594 | 32.897703 | 7951001.546875 | 32.113391 | 31.673766 | 8005789.8125 | 33.834984 | 33.331286 | 7459989.047619 | 35.685213 | 35.102541 | 9174882.016393 | 36.46273 | 35.985317 | 6591339.587302 | 39.491984 | 39.013391 | 5035284.0 | 40.963206 | 40.289476 | 5394024.698413 | 43.385066 | 42.705344 | 5984359.327869 | 45.528857 | 44.942651 | 4981663.079365 | 45.127313 | 44.260953 | 7043142.28125 | 42.363111 | 41.620952 | 6665399.507937 | 38.360328 | 37.510508 | 8101788.295082 | 38.6665 | 38.062969 | 10643153.84375 | 41.879109 | 41.256594 | 9773374.203125 | 39.127889 | 38.410063 | 8882569.047619 | 42.683161 | 42.123548 | 8615529.145161 | 45.023984 | 44.466397 | 6880527.587302 | 50.187952 | 49.569937 | 5433541.793651 | 55.202797 | 54.498413 | 5491270.8 |
| 3 | ABBV | AbbVie | Health Care | 31.711379 | 30.96831 | 10897951.293103 | 37.874344 | 36.874609 | 7384147.078125 | 38.554094 | 37.840172 | 4690873.125 | 43.141349 | 42.268556 | 6319228.111111 | 44.859967 | 43.921311 | 7609473.032787 | 46.878524 | 45.901365 | 6642542.222222 | 49.877344 | 48.994797 | 13670690.390625 | 57.269476 | 55.990286 | 13088959.507937 | 55.40041 | 54.023197 | 13034415.47541 | 60.105444 | 59.026444 | 12052673.666667 | 60.615578 | 59.162406 | 10942893.65625 | 54.219746 | 52.613365 | 9892407.507937 | 53.032361 | 51.245393 | 9461727.672131 | 58.174437 | 56.981422 | 9176243.015625 | 62.272141 | 61.371672 | 7220892.84375 | 59.454603 | 58.236333 | 7871293.349206 | 61.587597 | 60.726435 | 7046768.645161 | 66.49673 | 65.606238 | 5782878.920635 | 76.087587 | 74.74454 | 6246636.222222 | 93.05493 | 90.88035 | 6171383.966667 |
| 4 | ACN | Accenture | Information Technology | 65.413133 | 64.46965 | 2958136.65 | 71.669828 | 70.472016 | 3545633.890625 | 66.856953 | 65.881375 | 3753502.28125 | 68.918381 | 67.920349 | 3565505.31746 | 75.105098 | 73.927754 | 3385196.557377 | 74.092238 | 73.077921 | 3025692.761905 | 74.035734 | 73.190891 | 2751022.140625 | 77.558571 | 76.471746 | 3074781.619048 | 83.717639 | 82.393557 | 2900945.442623 | 90.404254 | 89.257111 | 2328701.730159 | 94.127406 | 92.481 | 2749655.25 | 100.857413 | 99.433635 | 2597977.174603 | 98.467754 | 96.567508 | 3563175.213115 | 111.829062 | 110.387188 | 2252809.296875 | 110.0425 | 108.5325 | 2355634.875 | 116.823968 | 115.13 | 2508397.920635 | 117.692258 | 116.198548 | 2577157.33871 | 121.100476 | 119.760794 | 2231790.698413 | 129.643016 | 128.244762 | 1905023.0 | 140.0467 | 138.514 | 1750601.6 |
| 5 | ADBE | Adobe Inc. | Information Technology | 39.7829 | 39.11015 | 3569916.466667 | 44.708719 | 43.920484 | 3709883.078125 | 48.072375 | 47.381406 | 3170725.46875 | 55.64381 | 54.627508 | 3985544.47619 | 64.792295 | 63.390164 | 3924305.557377 | 65.134952 | 63.753222 | 3614820.587302 | 71.433422 | 70.293672 | 2636903.1875 | 70.770476 | 69.400349 | 3332754.111111 | 75.196279 | 73.791721 | 2366642.131148 | 78.606508 | 77.328 | 2462360.793651 | 81.957266 | 80.139375 | 3419304.015625 | 90.60054 | 88.735 | 2952103.492063 | 87.827552 | 85.323174 | 3681589.95082 | 96.7314 | 95.154775 | 2111794.109375 | 100.877748 | 99.587905 | 1807114.6875 | 106.782841 | 105.022873 | 2293033.888889 | 117.874855 | 116.368194 | 2306142.419355 | 137.783524 | 135.854397 | 2570180.857143 | 150.173397 | 148.012063 | 1972602.984127 | 166.874633 | 163.9689 | 3072533.633333 |
| 6 | AMD | Advanced Micro Devices | Information Technology | 2.672167 | 2.564667 | 18729917.45 | 3.521875 | 3.371563 | 28015006.546875 | 3.874219 | 3.744844 | 29267395.953125 | 3.633492 | 3.531429 | 24142526.84127 | 3.882131 | 3.758033 | 25269923.836066 | 4.116508 | 3.99381 | 21170687.333333 | 4.112656 | 3.992812 | 29876772.28125 | 2.809206 | 2.703492 | 18739877.619048 | 2.868279 | 2.74918 | 17290156.918033 | 2.463016 | 2.385556 | 14835027.238095 | 1.936641 | 1.833281 | 13248920.109375 | 2.268175 | 2.167619 | 9464912.730159 | 2.35041 | 2.198033 | 14099698.442623 | 3.985508 | 3.764922 | 22684549.421875 | 6.523773 | 6.237422 | 35252396.84375 | 8.582698 | 8.168543 | 46450313.365079 | 12.959565 | 12.396373 | 64543556.741935 | 12.506833 | 11.939441 | 76875665.444444 | 13.323238 | 12.81497 | 64954818.825397 | 12.990133 | 12.568997 | 61987566.333333 |
| 7 | AES | AES Corporation | Utilities | 10.10842 | 9.914168 | 7043960.75 | 11.204391 | 10.953047 | 6695538.890625 | 11.321672 | 11.107187 | 4528306.390625 | 12.618524 | 12.345254 | 6451758.952381 | 12.553852 | 12.295115 | 6094453.245902 | 12.804984 | 12.569683 | 6143209.666667 | 13.290375 | 13.076266 | 4156436.75 | 12.298159 | 11.993286 | 6739359.666667 | 11.363377 | 11.076623 | 7149936.786885 | 12.214016 | 11.966571 | 6735586.222222 | 11.136202 | 10.871042 | 6675684.125 | 9.324605 | 9.074429 | 6452733.015873 | 9.368844 | 9.050907 | 6921824.786885 | 10.665641 | 10.432906 | 5180647.8125 | 11.885609 | 11.656453 | 4655622.859375 | 11.352571 | 11.085476 | 5835446.761905 | 11.157871 | 10.906742 | 5824308.983871 | 11.298889 | 11.09727 | 5616984.539683 | 11.091317 | 10.909143 | 4938647.380952 | 10.9345 | 10.755767 | 5099342.833333 |
| 8 | AFL | Aflac | Financials | 46.2141 | 45.433267 | 4104190.6 | 49.176641 | 48.289062 | 3435438.578125 | 54.800359 | 54.012328 | 2189281.828125 | 59.970825 | 59.241762 | 2141384.761905 | 58.619787 | 57.822098 | 2175752.688525 | 57.771492 | 57.017746 | 1926278.460317 | 56.681 | 56.033297 | 2120122.046875 | 55.331175 | 54.628984 | 2444321.063492 | 57.613 | 56.80082 | 2244606.655738 | 60.131063 | 59.417365 | 2073244.571429 | 57.887156 | 56.950672 | 2387365.484375 | 59.969111 | 59.055286 | 2358668.0 | 57.645623 | 56.610869 | 2563169.836066 | 66.453531 | 65.644219 | 2310928.90625 | 71.264313 | 70.539875 | 1739223.609375 | 69.491175 | 68.60873 | 2020415.47619 | 70.262032 | 69.446823 | 2214819.548387 | 74.90873 | 74.134111 | 1846555.492063 | 80.695032 | 79.83694 | 1541004.079365 | 84.0718 | 83.197033 | 1440761.533333 |
| 9 | A | Agilent Technologies | Health Care | 29.667933 | 29.165533 | 5100460.5 | 30.035625 | 29.418578 | 5595136.578125 | 32.569641 | 32.047453 | 3832068.15625 | 36.757286 | 36.169667 | 3353795.761905 | 40.050869 | 39.277967 | 4170020.704918 | 39.052714 | 38.360952 | 2721890.15873 | 39.777484 | 39.240562 | 2585435.890625 | 39.484873 | 38.748032 | 3063606.825397 | 39.669721 | 38.975672 | 2532460.409836 | 40.752206 | 40.151476 | 2448139.761905 | 37.349875 | 36.627813 | 2521089.796875 | 38.163619 | 37.420571 | 3004962.285714 | 37.845246 | 36.91459 | 2357458.967213 | 42.781375 | 42.130453 | 2219362.46875 | 46.398047 | 45.748719 | 1877364.125 | 45.384619 | 44.604825 | 1990839.174603 | 50.474952 | 49.724629 | 2029887.822581 | 57.100619 | 56.334349 | 2009349.047619 | 62.56181 | 61.779984 | 1643051.47619 | 67.413667 | 66.701367 | 1397107.4 |
| 10 | APD | Air Products and Chemicals | Materials | 77.760883 | 76.798833 | 1093138.95 | 81.289891 | 80.249953 | 1552260.046875 | 92.838203 | 91.196234 | 2046325.296875 | 98.905492 | 97.454413 | 1143625.253968 | 104.973115 | 103.321607 | 1455185.934426 | 110.916032 | 109.299365 | 1199132.857143 | 122.2025 | 120.556875 | 1628685.890625 | 127.274921 | 125.009206 | 1710348.936508 | 139.970164 | 137.682459 | 1108369.032787 | 138.894444 | 136.88254 | 1078376.428571 | 130.466094 | 127.980312 | 1500850.15625 | 129.770635 | 127.481429 | 1433344.746032 | 125.588852 | 122.712131 | 1905698.672131 | 138.716563 | 136.732812 | 1064761.28125 | 145.268594 | 143.489688 | 1379288.796875 | 137.335556 | 135.394921 | 1452917.984127 | 139.034194 | 137.274355 | 1545373.048387 | 141.140476 | 139.533016 | 1303113.492063 | 146.273016 | 144.856825 | 1410870.507937 | 156.523133 | 155.0267 | 926571.433333 |
| 11 | AKAM | Akamai | Information Technology | 38.345833 | 37.470117 | 3942939.716667 | 42.344391 | 41.340719 | 2741468.78125 | 47.475531 | 46.609094 | 1724353.609375 | 47.71781 | 46.69873 | 2251961.84127 | 55.955328 | 54.503295 | 2719331.836066 | 56.288587 | 54.859762 | 2191404.68254 | 60.705016 | 59.619875 | 1624116.5 | 60.925302 | 59.595159 | 1622168.206349 | 66.694918 | 65.303443 | 1631151.639344 | 74.717937 | 73.624524 | 1419948.698413 | 72.898438 | 71.173375 | 1800375.140625 | 62.297746 | 60.930667 | 2270051.634921 | 51.614267 | 49.998952 | 2834324.114754 | 53.284647 | 51.860842 | 1531503.703125 | 54.033789 | 53.00923 | 1895174.203125 | 64.308589 | 62.977471 | 2097015.238095 | 66.29605 | 65.184232 | 1916476.16129 | 53.296676 | 52.291944 | 2627907.0 | 48.271071 | 47.413467 | 1919155.206349 | 52.275247 | 51.428977 | 2122564.7 |
| 12 | ALB | Albemarle Corporation | Materials | 59.422 | 58.35975 | 738925.166667 | 58.865891 | 57.647281 | 798707.59375 | 59.128953 | 58.218312 | 804109.703125 | 62.520635 | 61.596651 | 662333.936508 | 61.266967 | 60.274475 | 631124.557377 | 65.074079 | 64.164619 | 533168.222222 | 62.160953 | 60.983437 | 1437183.359375 | 56.594016 | 55.267 | 1462997.079365 | 52.807508 | 51.484361 | 2007614.065574 | 57.841937 | 56.750413 | 1095955.634921 | 47.790344 | 46.420312 | 1471876.21875 | 50.553 | 49.121714 | 1480702.650794 | 54.459115 | 52.543164 | 1433544.245902 | 72.837594 | 71.230234 | 1899285.328125 | 81.901984 | 80.286344 | 1283941.09375 | 84.576016 | 82.875873 | 888396.555556 | 97.426516 | 95.477742 | 1036087.532258 | 109.16 | 107.409206 | 1023728.301587 | 119.617524 | 117.308175 | 1430350.84127 | 140.3418 | 137.821167 | 1259730.2 |
| 13 | ARE | Alexandria Real Estate Equities | Real Estate | 61.36855 | 60.743033 | 510877.366667 | 61.040516 | 59.928797 | 812680.375 | 57.557531 | 56.515922 | 497802.3125 | 56.650143 | 55.566413 | 487156.761905 | 62.462639 | 61.485738 | 523381.442623 | 67.283714 | 66.416952 | 380025.349206 | 70.467438 | 69.669313 | 459933.390625 | 76.38346 | 75.279413 | 490275.349206 | 89.050443 | 87.467197 | 471547.639344 | 86.529873 | 85.126968 | 399307.920635 | 84.633359 | 83.013094 | 364752.734375 | 85.659873 | 84.143825 | 468813.809524 | 77.99 | 76.298508 | 515845.04918 | 91.893234 | 90.337047 | 623097.203125 | 105.790109 | 103.950516 | 587047.640625 | 105.292381 | 103.161841 | 609395.222222 | 111.310645 | 109.527903 | 1028053.66129 | 115.439048 | 113.911905 | 727220.412698 | 120.099048 | 118.52127 | 589167.52381 | 123.519667 | 122.114167 | 492825.366667 |
| 14 | ALGN | Align Technology | Health Care | 31.159017 | 30.377733 | 965606.483333 | 34.975797 | 33.909 | 932521.21875 | 44.089359 | 43.058906 | 623338.703125 | 54.480365 | 53.042889 | 1102201.47619 | 56.988443 | 55.11541 | 1269293.868852 | 52.782825 | 51.130048 | 1145288.68254 | 55.01075 | 53.947656 | 664095.84375 | 53.482111 | 52.269952 | 715800.730159 | 57.678639 | 56.164098 | 932696.57377 | 59.708095 | 58.407381 | 876989.904762 | 60.373438 | 58.926797 | 730518.59375 | 64.794143 | 63.197921 | 649318.666667 | 66.427911 | 64.275434 | 654232.180328 | 77.125 | 75.716822 | 578877.9375 | 91.181948 | 89.562223 | 596522.5 | 94.061211 | 91.813589 | 876983.809524 | 101.910419 | 99.875898 | 787098.354839 | 137.06419 | 134.170984 | 1255167.301587 | 172.816968 | 169.111603 | 868528.095238 | 213.796 | 208.346733 | 1115914.7 |
| 16 | LNT | Alliant Energy | Utilities | 19.833667 | 19.630533 | 1096029.6 | 21.7005 | 21.357781 | 1031537.703125 | 22.163375 | 21.863391 | 1133217.8125 | 22.626413 | 22.287175 | 1118667.571429 | 23.384557 | 23.072443 | 1122770.180328 | 25.851143 | 25.529857 | 1148605.238095 | 25.994484 | 25.654 | 1289851.59375 | 28.091159 | 27.61719 | 1222716.936508 | 29.901721 | 29.356951 | 1519714.639344 | 28.033143 | 27.640937 | 1334767.587302 | 27.474594 | 27.022156 | 1400986.171875 | 28.269857 | 27.814222 | 1324544.444444 | 32.027246 | 31.455148 | 1505386.147541 | 35.556656 | 34.981984 | 2127155.546875 | 37.931031 | 37.304672 | 1506166.765625 | 35.969651 | 35.375222 | 1294674.31746 | 37.607484 | 37.134 | 1213372.370968 | 39.966476 | 39.541794 | 1381354.412698 | 41.404048 | 40.971587 | 1011096.301587 | 43.097533 | 42.592293 | 1377366.533333 |
| 17 | ALL | Allstate | Financials | 41.775167 | 41.2432 | 3434803.983333 | 45.130828 | 44.478172 | 3923570.75 | 46.851547 | 46.229 | 2952061.890625 | 49.845937 | 49.225032 | 2502155.619048 | 50.444 | 49.757066 | 3070192.934426 | 54.400365 | 53.872794 | 2482041.920635 | 56.904359 | 56.361063 | 2327236.03125 | 62.651508 | 61.936 | 2418033.063492 | 67.630475 | 66.660803 | 2569832.672131 | 65.974413 | 65.215794 | 2616602.15873 | 60.50425 | 59.549359 | 3279574.0625 | 60.445508 | 59.56127 | 2817078.380952 | 61.341361 | 60.395262 | 2948406.852459 | 65.640938 | 64.841594 | 1997076.203125 | 67.744703 | 67.177312 | 1629849.046875 | 69.673952 | 68.879175 | 2349906.809524 | 78.158903 | 77.394903 | 1819662.048387 | 84.550587 | 83.749984 | 1847252.634921 | 91.21501 | 90.157444 | 1960922.936508 | 94.74633 | 93.664003 | 1809744.966667 |
| 18 | GOOGL | Alphabet Inc. (Class A) | Communication Services | 391.94 | 386.54 | 4710365.566667 | 427.770781 | 420.828594 | 4744065.375 | 446.389219 | 440.9575 | 3683857.9375 | 510.070476 | 503.04619 | 3619180.253968 | 589.497869 | 580.325738 | 4558310.721311 | 557.104603 | 545.903333 | 2446307.904762 | 593.085625 | 585.090781 | 1612151.828125 | 551.79746 | 542.204127 | 2198303.619048 | 546.473607 | 536.552951 | 2368245.360656 | 553.551905 | 545.467778 | 1767785.650794 | 651.789531 | 636.782187 | 2616696.96875 | 745.732619 | 731.616508 | 2054101.825397 | 745.071836 | 727.587508 | 2133765.016393 | 739.610859 | 728.390531 | 1371835.65625 | 785.743031 | 777.097047 | 1061037.15625 | 806.646873 | 793.98481 | 1732969.52381 | 844.654742 | 835.665145 | 1500393.129032 | 941.14981 | 928.679 | 1699425.68254 | 954.277016 | 941.745508 | 1649185.634921 | 1018.2223 | 1004.470233 | 1591438.266667 |
| 19 | MO | Altria | Consumer Staples | 27.804733 | 27.513033 | 13021175.25 | 30.020234 | 29.596156 | 11530296.125 | 29.783438 | 29.420891 | 10201472.640625 | 31.517444 | 31.145762 | 8860092.920635 | 31.564984 | 31.140131 | 10249242.163934 | 35.299429 | 34.885984 | 8557514.396825 | 38.049359 | 37.613687 | 7041493.1875 | 43.771159 | 43.163921 | 7621816.238095 | 48.308754 | 47.592033 | 8041661.639344 | 46.593317 | 45.944587 | 7200026.952381 | 49.801453 | 48.957937 | 7766711.328125 | 54.317063 | 53.491048 | 6346995.714286 | 57.344525 | 56.343689 | 7185249.213115 | 61.154234 | 60.290312 | 6438985.9375 | 64.032469 | 63.298719 | 5463859.625 | 62.576143 | 61.735317 | 6994040.809524 | 70.895935 | 70.206661 | 6767171.483871 | 72.274222 | 71.559794 | 5800049.904762 | 66.655587 | 65.611365 | 8136257.253968 | 64.755757 | 63.85896 | 6578395.933333 |
| 20 | AMZN | Amazon | Consumer Discretionary | 269.118 | 262.854 | 3522848.05 | 269.194062 | 263.381406 | 3039878.9375 | 300.553125 | 294.989531 | 2420409.5625 | 362.279683 | 354.804127 | 2940043.174603 | 375.857541 | 366.867541 | 3931548.016393 | 320.385238 | 311.777143 | 4657967.349206 | 335.560938 | 329.103437 | 3587100.65625 | 315.103968 | 307.935556 | 4204031.571429 | 355.356557 | 347.714426 | 3923801.163934 | 421.807937 | 414.614286 | 3037333.206349 | 512.614375 | 500.030937 | 4089669.734375 | 635.442857 | 621.514841 | 4296253.888889 | 576.212148 | 557.810295 | 4989470.04918 | 681.847781 | 670.028109 | 3135022.640625 | 769.382875 | 760.123859 | 2378967.3125 | 791.033063 | 777.629952 | 3940060.904762 | 836.813048 | 827.159419 | 3134660.548387 | 961.248302 | 946.97246 | 3873340.206349 | 989.650429 | 975.274317 | 3407886.507937 | 1038.752167 | 1022.620467 | 3624009.266667 |
For this tutorial’s purposes, we want just an average price for each stock for each quarter from 2014 to 2017. This block averages the quarterly mean high and mean low of each stock and places it into a new column just called Avg.
#Clean the data so that we average the high and low of each quarter to get one fixed average stock price.
years = list(range(2013, 2018))
quarters = ["Q1", "Q2", "Q3", "Q4"]
for year in years:
for quarter in quarters:
year_quarter = f"{year} {quarter}"
high_col = f"{year_quarter} Mean High"
low_col = f"{year_quarter} Mean Low"
high = cleaned_stocks_df[high_col]
low = cleaned_stocks_df[low_col]
col_index = cleaned_stocks_df.columns.get_loc(high_col)
cleaned_stocks_df.insert(col_index, f"{year_quarter} Avg", (high + low) / 2)
cleaned_stocks_df = cleaned_stocks_df.drop([high_col, low_col], axis=1)
cleaned_stocks_df.head(20)
| Symbol | Security | GICS Sector | 2013 Q1 Avg | 2013 Q1 Mean Volume | 2013 Q2 Avg | 2013 Q2 Mean Volume | 2013 Q3 Avg | 2013 Q3 Mean Volume | 2013 Q4 Avg | 2013 Q4 Mean Volume | 2014 Q1 Avg | 2014 Q1 Mean Volume | 2014 Q2 Avg | 2014 Q2 Mean Volume | 2014 Q3 Avg | 2014 Q3 Mean Volume | 2014 Q4 Avg | 2014 Q4 Mean Volume | 2015 Q1 Avg | 2015 Q1 Mean Volume | 2015 Q2 Avg | 2015 Q2 Mean Volume | 2015 Q3 Avg | 2015 Q3 Mean Volume | 2015 Q4 Avg | 2015 Q4 Mean Volume | 2016 Q1 Avg | 2016 Q1 Mean Volume | 2016 Q2 Avg | 2016 Q2 Mean Volume | 2016 Q3 Avg | 2016 Q3 Mean Volume | 2016 Q4 Avg | 2016 Q4 Mean Volume | 2017 Q1 Avg | 2017 Q1 Mean Volume | 2017 Q2 Avg | 2017 Q2 Mean Volume | 2017 Q3 Avg | 2017 Q3 Mean Volume | 2017 Q4 Avg | 2017 Q4 Mean Volume | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | MMM | 3M | Industrials | 90.851908 | 3000908.633333 | 97.426305 | 3184629.796875 | 104.915156 | 2593571.890625 | 115.301111 | 3003412.984127 | 121.361475 | 3286431.163934 | 128.809524 | 2362018.47619 | 132.804375 | 2302554.203125 | 142.34746 | 3041470.253968 | 153.952541 | 2458525.803279 | 150.909841 | 2440807.730159 | 139.254219 | 2956910.703125 | 146.417222 | 2596462.174603 | 147.278852 | 2821154.639344 | 162.837109 | 1975933.640625 | 173.670547 | 1730216.75 | 168.894365 | 2021322.095238 | 180.604355 | 1989174.451613 | 198.205635 | 1849468.174603 | 206.471683 | 1684750.68254 | 223.797633 | 1925851.566667 |
| 1 | AOS | A. O. Smith | Industrials | 16.543908 | 1580067.933333 | 17.747742 | 1883052.6875 | 19.89607 | 1230033.8125 | 24.40654 | 1645559.492063 | 23.347787 | 1678852.622951 | 23.053714 | 1349398.031746 | 23.388336 | 1015275.21875 | 25.160087 | 999006.349206 | 29.421975 | 1118223.540984 | 33.458754 | 1020983.460317 | 33.332672 | 1674719.625 | 36.824968 | 1528875.650794 | 34.223426 | 1840718.47541 | 39.644 | 1472523.25 | 46.048305 | 1067305.671875 | 47.710865 | 1280703.47619 | 49.149016 | 919397.596774 | 53.660643 | 869093.190476 | 55.894032 | 1342885.873016 | 60.164015 | 862284.733333 |
| 2 | ABT | Abbott | Health Care | 30.51735 | 12451680.116667 | 33.190148 | 7951001.546875 | 31.893578 | 8005789.8125 | 33.583135 | 7459989.047619 | 35.393877 | 9174882.016393 | 36.224024 | 6591339.587302 | 39.252688 | 5035284.0 | 40.626341 | 5394024.698413 | 43.045205 | 5984359.327869 | 45.235754 | 4981663.079365 | 44.694133 | 7043142.28125 | 41.992032 | 6665399.507937 | 37.935418 | 8101788.295082 | 38.364734 | 10643153.84375 | 41.567852 | 9773374.203125 | 38.768976 | 8882569.047619 | 42.403355 | 8615529.145161 | 44.74519 | 6880527.587302 | 49.878944 | 5433541.793651 | 54.850605 | 5491270.8 |
| 3 | ABBV | AbbVie | Health Care | 31.339845 | 10897951.293103 | 37.374477 | 7384147.078125 | 38.197133 | 4690873.125 | 42.704952 | 6319228.111111 | 44.390639 | 7609473.032787 | 46.389944 | 6642542.222222 | 49.43607 | 13670690.390625 | 56.629881 | 13088959.507937 | 54.711803 | 13034415.47541 | 59.565944 | 12052673.666667 | 59.888992 | 10942893.65625 | 53.416556 | 9892407.507937 | 52.138877 | 9461727.672131 | 57.57793 | 9176243.015625 | 61.821906 | 7220892.84375 | 58.845468 | 7871293.349206 | 61.157016 | 7046768.645161 | 66.051484 | 5782878.920635 | 75.416063 | 6246636.222222 | 91.96764 | 6171383.966667 |
| 4 | ACN | Accenture | Information Technology | 64.941392 | 2958136.65 | 71.070922 | 3545633.890625 | 66.369164 | 3753502.28125 | 68.419365 | 3565505.31746 | 74.516426 | 3385196.557377 | 73.585079 | 3025692.761905 | 73.613313 | 2751022.140625 | 77.015159 | 3074781.619048 | 83.055598 | 2900945.442623 | 89.830683 | 2328701.730159 | 93.304203 | 2749655.25 | 100.145524 | 2597977.174603 | 97.517631 | 3563175.213115 | 111.108125 | 2252809.296875 | 109.2875 | 2355634.875 | 115.976984 | 2508397.920635 | 116.945403 | 2577157.33871 | 120.430635 | 2231790.698413 | 128.943889 | 1905023.0 | 139.28035 | 1750601.6 |
| 5 | ADBE | Adobe Inc. | Information Technology | 39.446525 | 3569916.466667 | 44.314602 | 3709883.078125 | 47.726891 | 3170725.46875 | 55.135659 | 3985544.47619 | 64.09123 | 3924305.557377 | 64.444087 | 3614820.587302 | 70.863547 | 2636903.1875 | 70.085413 | 3332754.111111 | 74.494 | 2366642.131148 | 77.967254 | 2462360.793651 | 81.04832 | 3419304.015625 | 89.66777 | 2952103.492063 | 86.575363 | 3681589.95082 | 95.943087 | 2111794.109375 | 100.232827 | 1807114.6875 | 105.902857 | 2293033.888889 | 117.121524 | 2306142.419355 | 136.81896 | 2570180.857143 | 149.09273 | 1972602.984127 | 165.421767 | 3072533.633333 |
| 6 | AMD | Advanced Micro Devices | Information Technology | 2.618417 | 18729917.45 | 3.446719 | 28015006.546875 | 3.809531 | 29267395.953125 | 3.58246 | 24142526.84127 | 3.820082 | 25269923.836066 | 4.055159 | 21170687.333333 | 4.052734 | 29876772.28125 | 2.756349 | 18739877.619048 | 2.80873 | 17290156.918033 | 2.424286 | 14835027.238095 | 1.884961 | 13248920.109375 | 2.217897 | 9464912.730159 | 2.274221 | 14099698.442623 | 3.875215 | 22684549.421875 | 6.380598 | 35252396.84375 | 8.375621 | 46450313.365079 | 12.677969 | 64543556.741935 | 12.223137 | 76875665.444444 | 13.069104 | 64954818.825397 | 12.779565 | 61987566.333333 |
| 7 | AES | AES Corporation | Utilities | 10.011294 | 7043960.75 | 11.078719 | 6695538.890625 | 11.21443 | 4528306.390625 | 12.481889 | 6451758.952381 | 12.424484 | 6094453.245902 | 12.687333 | 6143209.666667 | 13.18332 | 4156436.75 | 12.145722 | 6739359.666667 | 11.22 | 7149936.786885 | 12.090294 | 6735586.222222 | 11.003622 | 6675684.125 | 9.199517 | 6452733.015873 | 9.209875 | 6921824.786885 | 10.549273 | 5180647.8125 | 11.771031 | 4655622.859375 | 11.219024 | 5835446.761905 | 11.032306 | 5824308.983871 | 11.198079 | 5616984.539683 | 11.00023 | 4938647.380952 | 10.845133 | 5099342.833333 |
| 8 | AFL | Aflac | Financials | 45.823683 | 4104190.6 | 48.732852 | 3435438.578125 | 54.406344 | 2189281.828125 | 59.606294 | 2141384.761905 | 58.220943 | 2175752.688525 | 57.394619 | 1926278.460317 | 56.357148 | 2120122.046875 | 54.980079 | 2444321.063492 | 57.20691 | 2244606.655738 | 59.774214 | 2073244.571429 | 57.418914 | 2387365.484375 | 59.512198 | 2358668.0 | 57.128246 | 2563169.836066 | 66.048875 | 2310928.90625 | 70.902094 | 1739223.609375 | 69.049952 | 2020415.47619 | 69.854427 | 2214819.548387 | 74.521421 | 1846555.492063 | 80.265986 | 1541004.079365 | 83.634417 | 1440761.533333 |
| 9 | A | Agilent Technologies | Health Care | 29.416733 | 5100460.5 | 29.727102 | 5595136.578125 | 32.308547 | 3832068.15625 | 36.463476 | 3353795.761905 | 39.664418 | 4170020.704918 | 38.706833 | 2721890.15873 | 39.509023 | 2585435.890625 | 39.116452 | 3063606.825397 | 39.322697 | 2532460.409836 | 40.451841 | 2448139.761905 | 36.988844 | 2521089.796875 | 37.792095 | 3004962.285714 | 37.379918 | 2357458.967213 | 42.455914 | 2219362.46875 | 46.073383 | 1877364.125 | 44.994722 | 1990839.174603 | 50.09979 | 2029887.822581 | 56.717484 | 2009349.047619 | 62.170897 | 1643051.47619 | 67.057517 | 1397107.4 |
| 10 | APD | Air Products and Chemicals | Materials | 77.279858 | 1093138.95 | 80.769922 | 1552260.046875 | 92.017219 | 2046325.296875 | 98.179952 | 1143625.253968 | 104.147361 | 1455185.934426 | 110.107698 | 1199132.857143 | 121.379687 | 1628685.890625 | 126.142063 | 1710348.936508 | 138.826311 | 1108369.032787 | 137.888492 | 1078376.428571 | 129.223203 | 1500850.15625 | 128.626032 | 1433344.746032 | 124.150492 | 1905698.672131 | 137.724688 | 1064761.28125 | 144.379141 | 1379288.796875 | 136.365238 | 1452917.984127 | 138.154274 | 1545373.048387 | 140.336746 | 1303113.492063 | 145.564921 | 1410870.507937 | 155.774917 | 926571.433333 |
| 11 | AKAM | Akamai | Information Technology | 37.907975 | 3942939.716667 | 41.842555 | 2741468.78125 | 47.042313 | 1724353.609375 | 47.20827 | 2251961.84127 | 55.229311 | 2719331.836066 | 55.574175 | 2191404.68254 | 60.162445 | 1624116.5 | 60.26023 | 1622168.206349 | 65.99918 | 1631151.639344 | 74.17123 | 1419948.698413 | 72.035906 | 1800375.140625 | 61.614206 | 2270051.634921 | 50.80661 | 2834324.114754 | 52.572745 | 1531503.703125 | 53.521509 | 1895174.203125 | 63.64303 | 2097015.238095 | 65.740141 | 1916476.16129 | 52.79431 | 2627907.0 | 47.842269 | 1919155.206349 | 51.852112 | 2122564.7 |
| 12 | ALB | Albemarle Corporation | Materials | 58.890875 | 738925.166667 | 58.256586 | 798707.59375 | 58.673633 | 804109.703125 | 62.058643 | 662333.936508 | 60.770721 | 631124.557377 | 64.619349 | 533168.222222 | 61.572195 | 1437183.359375 | 55.930508 | 1462997.079365 | 52.145934 | 2007614.065574 | 57.296175 | 1095955.634921 | 47.105328 | 1471876.21875 | 49.837357 | 1480702.650794 | 53.501139 | 1433544.245902 | 72.033914 | 1899285.328125 | 81.094164 | 1283941.09375 | 83.725944 | 888396.555556 | 96.452129 | 1036087.532258 | 108.284603 | 1023728.301587 | 118.462849 | 1430350.84127 | 139.081483 | 1259730.2 |
| 13 | ARE | Alexandria Real Estate Equities | Real Estate | 61.055792 | 510877.366667 | 60.484656 | 812680.375 | 57.036727 | 497802.3125 | 56.108278 | 487156.761905 | 61.974189 | 523381.442623 | 66.850333 | 380025.349206 | 70.068375 | 459933.390625 | 75.831437 | 490275.349206 | 88.25882 | 471547.639344 | 85.828421 | 399307.920635 | 83.823227 | 364752.734375 | 84.901849 | 468813.809524 | 77.144254 | 515845.04918 | 91.115141 | 623097.203125 | 104.870312 | 587047.640625 | 104.227111 | 609395.222222 | 110.419274 | 1028053.66129 | 114.675476 | 727220.412698 | 119.310159 | 589167.52381 | 122.816917 | 492825.366667 |
| 14 | ALGN | Align Technology | Health Care | 30.768375 | 965606.483333 | 34.442398 | 932521.21875 | 43.574133 | 623338.703125 | 53.761627 | 1102201.47619 | 56.051926 | 1269293.868852 | 51.956437 | 1145288.68254 | 54.479203 | 664095.84375 | 52.876032 | 715800.730159 | 56.921369 | 932696.57377 | 59.057738 | 876989.904762 | 59.650117 | 730518.59375 | 63.996032 | 649318.666667 | 65.351673 | 654232.180328 | 76.420911 | 578877.9375 | 90.372086 | 596522.5 | 92.9374 | 876983.809524 | 100.893159 | 787098.354839 | 135.617587 | 1255167.301587 | 170.964286 | 868528.095238 | 211.071367 | 1115914.7 |
| 16 | LNT | Alliant Energy | Utilities | 19.7321 | 1096029.6 | 21.529141 | 1031537.703125 | 22.013383 | 1133217.8125 | 22.456794 | 1118667.571429 | 23.2285 | 1122770.180328 | 25.6905 | 1148605.238095 | 25.824242 | 1289851.59375 | 27.854175 | 1222716.936508 | 29.629336 | 1519714.639344 | 27.83704 | 1334767.587302 | 27.248375 | 1400986.171875 | 28.04204 | 1324544.444444 | 31.741197 | 1505386.147541 | 35.26932 | 2127155.546875 | 37.617852 | 1506166.765625 | 35.672437 | 1294674.31746 | 37.370742 | 1213372.370968 | 39.754135 | 1381354.412698 | 41.187817 | 1011096.301587 | 42.844913 | 1377366.533333 |
| 17 | ALL | Allstate | Financials | 41.509183 | 3434803.983333 | 44.8045 | 3923570.75 | 46.540273 | 2952061.890625 | 49.535484 | 2502155.619048 | 50.100533 | 3070192.934426 | 54.136579 | 2482041.920635 | 56.632711 | 2327236.03125 | 62.293754 | 2418033.063492 | 67.145639 | 2569832.672131 | 65.595103 | 2616602.15873 | 60.026805 | 3279574.0625 | 60.003389 | 2817078.380952 | 60.868311 | 2948406.852459 | 65.241266 | 1997076.203125 | 67.461008 | 1629849.046875 | 69.276563 | 2349906.809524 | 77.776903 | 1819662.048387 | 84.150286 | 1847252.634921 | 90.686227 | 1960922.936508 | 94.205167 | 1809744.966667 |
| 18 | GOOGL | Alphabet Inc. (Class A) | Communication Services | 389.24 | 4710365.566667 | 424.299688 | 4744065.375 | 443.673359 | 3683857.9375 | 506.558333 | 3619180.253968 | 584.911803 | 4558310.721311 | 551.503968 | 2446307.904762 | 589.088203 | 1612151.828125 | 547.000794 | 2198303.619048 | 541.513279 | 2368245.360656 | 549.509841 | 1767785.650794 | 644.285859 | 2616696.96875 | 738.674563 | 2054101.825397 | 736.329672 | 2133765.016393 | 734.000695 | 1371835.65625 | 781.420039 | 1061037.15625 | 800.315841 | 1732969.52381 | 840.159944 | 1500393.129032 | 934.914405 | 1699425.68254 | 948.011262 | 1649185.634921 | 1011.346267 | 1591438.266667 |
| 19 | MO | Altria | Consumer Staples | 27.658883 | 13021175.25 | 29.808195 | 11530296.125 | 29.602164 | 10201472.640625 | 31.331603 | 8860092.920635 | 31.352557 | 10249242.163934 | 35.092706 | 8557514.396825 | 37.831523 | 7041493.1875 | 43.46754 | 7621816.238095 | 47.950393 | 8041661.639344 | 46.268952 | 7200026.952381 | 49.379695 | 7766711.328125 | 53.904056 | 6346995.714286 | 56.844107 | 7185249.213115 | 60.722273 | 6438985.9375 | 63.665594 | 5463859.625 | 62.15573 | 6994040.809524 | 70.551298 | 6767171.483871 | 71.917008 | 5800049.904762 | 66.133476 | 8136257.253968 | 64.307358 | 6578395.933333 |
| 20 | AMZN | Amazon | Consumer Discretionary | 265.986 | 3522848.05 | 266.287734 | 3039878.9375 | 297.771328 | 2420409.5625 | 358.541905 | 2940043.174603 | 371.362541 | 3931548.016393 | 316.08119 | 4657967.349206 | 332.332188 | 3587100.65625 | 311.519762 | 4204031.571429 | 351.535492 | 3923801.163934 | 418.211111 | 3037333.206349 | 506.322656 | 4089669.734375 | 628.478849 | 4296253.888889 | 567.011221 | 4989470.04918 | 675.937945 | 3135022.640625 | 764.753367 | 2378967.3125 | 784.331508 | 3940060.904762 | 831.986234 | 3134660.548387 | 954.110381 | 3873340.206349 | 982.462373 | 3407886.507937 | 1030.686317 | 3624009.266667 |
#Identify all GICS Sectors -> 11 sectors total
sectors = cleaned_stocks_df["GICS Sector"]
sectors = list(sectors.sort_values().unique())
sectors
['Communication Services', 'Consumer Discretionary', 'Consumer Staples', 'Energy', 'Financials', 'Health Care', 'Industrials', 'Information Technology', 'Materials', 'Real Estate', 'Utilities']
Finally, after identifying every sector within this data set, we want to create a separate dataframe that contains the quarterly averages for each sector as a whole. This will eventually be useful because it will allow us to analyze and predict entire sectors’ performances.
# high unemployment = low interest rates meaning look at faster increase in discretionary comsumer spending and real estate
# opposite for low unemployment plus a faster increase in healthcare/industrials/info tech
sectors_avg_df = cleaned_stocks_df.loc[:, "GICS Sector":].groupby("GICS Sector").mean().loc[:, ::2].T
# Reformat index column to match the gdp_unemp_df quarter format
def reformat_quarter(original):
original = original.split()
new = f"{original[0]}{original[1]}"
return new
sectors_avg_df.index = sectors_avg_df.index.map(reformat_quarter)
sectors_avg_df.columns.name = ""
sectors_avg_df.index.name = "Quarter"
sectors_avg_df
| Communication Services | Consumer Discretionary | Consumer Staples | Energy | Financials | Health Care | Industrials | Information Technology | Materials | Real Estate | Utilities | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Quarter | |||||||||||
| 2013Q1 | 53.606051 | 83.967426 | 46.972873 | 51.025156 | 43.034767 | 58.227233 | 50.13672 | 37.150871 | 56.204277 | 67.174301 | 35.028688 |
| 2013Q2 | 61.00862 | 89.308101 | 51.391396 | 53.455384 | 46.718175 | 64.098781 | 53.109493 | 38.288068 | 57.968286 | 70.041769 | 37.733436 |
| 2013Q3 | 65.873971 | 93.510542 | 52.518399 | 56.484413 | 50.283462 | 68.028496 | 57.37949 | 40.657204 | 59.85785 | 68.106862 | 37.297389 |
| 2013Q4 | 72.531378 | 102.320923 | 54.638603 | 60.924815 | 54.157675 | 73.369455 | 62.558362 | 43.918417 | 63.990034 | 68.463233 | 37.789467 |
| 2014Q1 | 79.785127 | 111.72085 | 55.017444 | 63.643351 | 55.838741 | 79.980759 | 66.707765 | 48.093618 | 67.619884 | 72.223258 | 39.18897 |
| 2014Q2 | 78.849469 | 109.985092 | 57.418072 | 71.784604 | 56.229804 | 80.388476 | 69.673904 | 49.389855 | 70.97771 | 76.154216 | 42.677177 |
| 2014Q3 | 84.941217 | 115.342292 | 58.62104 | 75.521954 | 58.931846 | 86.80099 | 71.378148 | 52.955226 | 73.199359 | 79.7695 | 43.006188 |
| 2014Q4 | 82.047915 | 119.922544 | 62.203928 | 64.713406 | 61.576242 | 93.153398 | 73.974583 | 54.933903 | 71.670788 | 84.433212 | 46.418571 |
| 2015Q1 | 85.628869 | 127.960974 | 66.312432 | 61.002086 | 63.637636 | 101.919575 | 77.389209 | 59.238162 | 76.910675 | 93.024833 | 48.208254 |
| 2015Q2 | 89.74348 | 132.918902 | 67.514595 | 64.015145 | 65.984916 | 108.131197 | 78.462107 | 62.797285 | 80.508134 | 92.387045 | 45.882142 |
| 2015Q3 | 97.810651 | 140.813396 | 67.645306 | 54.929293 | 65.201938 | 108.228322 | 74.881097 | 60.153303 | 75.929594 | 92.55864 | 45.278884 |
| 2015Q4 | 105.505572 | 145.419427 | 70.150361 | 53.150133 | 66.175417 | 104.478399 | 75.051292 | 61.515457 | 74.955804 | 98.474586 | 46.425979 |
| 2016Q1 | 102.474182 | 137.820155 | 72.457023 | 45.596145 | 60.995942 | 98.875879 | 72.438275 | 56.736884 | 71.177349 | 99.041406 | 50.134951 |
| 2016Q2 | 106.637626 | 146.901684 | 76.714013 | 53.503052 | 66.218309 | 106.447697 | 80.129022 | 60.549082 | 82.362036 | 107.645605 | 54.423475 |
| 2016Q3 | 112.92182 | 151.291667 | 79.984418 | 55.962611 | 69.761744 | 113.4712 | 84.758315 | 64.378083 | 85.308012 | 112.820678 | 56.810371 |
| 2016Q4 | 117.799688 | 148.332804 | 75.535874 | 60.096161 | 74.302982 | 108.67958 | 89.435799 | 68.264731 | 85.779932 | 104.352883 | 53.869594 |
| 2017Q1 | 128.686692 | 160.978991 | 77.995511 | 60.850663 | 81.860864 | 116.30322 | 97.096549 | 74.68916 | 93.027048 | 109.16407 | 57.069938 |
| 2017Q2 | 139.104742 | 176.364536 | 80.152594 | 56.972089 | 83.987991 | 127.73827 | 99.816145 | 80.491237 | 97.689698 | 115.642488 | 60.883284 |
| 2017Q3 | 145.288521 | 180.998835 | 79.467716 | 56.349285 | 89.211384 | 136.841759 | 104.330158 | 85.088161 | 100.97046 | 120.344693 | 63.092819 |
| 2017Q4 | 149.647204 | 193.286556 | 79.599355 | 60.289606 | 94.736015 | 140.991576 | 112.327765 | 91.978299 | 108.369587 | 123.064697 | 64.680491 |
Exploratory Analysis and Data Visualization¶
Initial Visualization¶
5 of the 11 sectors in the S&P 500 are known to fluctuate the most based on unemployment rates. These sectors include Consumer Discretionary, Real Estate, Health Care, Industrials, Information Technology. Our first visualization shows these 5 sectors performance over time. Next is a visualization of natural and underlying long term unemployment over time. We initially did these separately to just look at their general direction of growth alone before we analyzed their effects on each other. More information about these connections can be found here and here.
sectors_gdp_unemp_df = pd.merge(sectors_avg_df, gdp_unemp_df, left_index=True, right_index=True)
unemp_sectors_ax = sectors_gdp_unemp_df[["Consumer Discretionary", "Real Estate", "Health Care", "Industrials", "Information Technology"]].plot(figsize=(10,6))
unemp_sectors_ax.set_xticks(range(len(sectors_gdp_unemp_df)))
unemp_sectors_ax.set_xticklabels(sectors_gdp_unemp_df.index.tolist(), rotation=90)
unemp_ax = sectors_gdp_unemp_df[["Unemployment Underlying Long Term", "Unemployment Natural"]].plot(figsize=(10,6))
unemp_ax.set_xticks(range(len(sectors_gdp_unemp_df)))
unemp_ax.set_xticklabels(sectors_gdp_unemp_df.index.tolist(), rotation=90)
[Text(0, 0, '2013Q1'), Text(1, 0, '2013Q2'), Text(2, 0, '2013Q3'), Text(3, 0, '2013Q4'), Text(4, 0, '2014Q1'), Text(5, 0, '2014Q2'), Text(6, 0, '2014Q3'), Text(7, 0, '2014Q4'), Text(8, 0, '2015Q1'), Text(9, 0, '2015Q2'), Text(10, 0, '2015Q3'), Text(11, 0, '2015Q4'), Text(12, 0, '2016Q1'), Text(13, 0, '2016Q2'), Text(14, 0, '2016Q3'), Text(15, 0, '2016Q4'), Text(16, 0, '2017Q1'), Text(17, 0, '2017Q2'), Text(18, 0, '2017Q3'), Text(19, 0, '2017Q4')]
We then combined these 2 graphs to analyze the economic factors effects on those 5 GICS Sectors from the S&P 500. What we can see here is with higher unemployment rates, we will naturally have lower interest rates. By doing so we will see a faster increase in growth for consumer spending, real estate markets, and information technology. However, the opposite happens for the healthcare and industrial sectors. These sectors as the unemployment decreases tend to have a faster growth and as it starts to plateau or even increase the growth rate for those 2 sectors becomes slower.
#Checking industries which would be affected by Unemployment with the Unemployment Natural.
scaler = MinMaxScaler()
norm_columns = sectors_gdp_unemp_df.columns[1:]
df_normalized = pd.DataFrame(scaler.fit_transform(sectors_gdp_unemp_df[norm_columns]), columns=norm_columns)
df_normalized
unemp_ax = df_normalized[["Consumer Discretionary", "Real Estate", "Health Care", "Industrials", "Information Technology", "Unemployment Natural"]].plot(figsize=(10,6))
unemp_ax.set_xticks(range(len(df_normalized)))
unemp_ax.set_xticklabels(df_normalized.index.tolist(), rotation=90)
[Text(0, 0, '0'), Text(1, 0, '1'), Text(2, 0, '2'), Text(3, 0, '3'), Text(4, 0, '4'), Text(5, 0, '5'), Text(6, 0, '6'), Text(7, 0, '7'), Text(8, 0, '8'), Text(9, 0, '9'), Text(10, 0, '10'), Text(11, 0, '11'), Text(12, 0, '12'), Text(13, 0, '13'), Text(14, 0, '14'), Text(15, 0, '15'), Text(16, 0, '16'), Text(17, 0, '17'), Text(18, 0, '18'), Text(19, 0, '19')]
3 of the 11 sectors in the S&P 500 are known to fluctuate the most based on unemployment rates. These sectors include Energy, Materials, and Industrials. Our first visualization shows these 3 sectors performance over time. Then, a visualization of Real GDP adjusted for inflation as well as Nominal GDP over time.
# looking at gdp down energy/industrials/materials should be down too. gdp down means people abroad are doing better than us.
gdp_sectors_ax = sectors_gdp_unemp_df[["Energy", "Materials", "Industrials"]].plot(figsize=(10,6))
gdp_sectors_ax.set_xticks(range(len(sectors_gdp_unemp_df)))
gdp_sectors_ax.set_xticklabels(sectors_gdp_unemp_df.index.tolist(), rotation=90)
gdp_ax = sectors_gdp_unemp_df[["Real GDP", "Nominal GDP"]].plot(figsize=(10,6))
gdp_ax.set_xticks(range(len(sectors_gdp_unemp_df)))
gdp_ax.set_xticklabels(sectors_gdp_unemp_df.index.tolist(), rotation=90)
[Text(0, 0, '2013Q1'), Text(1, 0, '2013Q2'), Text(2, 0, '2013Q3'), Text(3, 0, '2013Q4'), Text(4, 0, '2014Q1'), Text(5, 0, '2014Q2'), Text(6, 0, '2014Q3'), Text(7, 0, '2014Q4'), Text(8, 0, '2015Q1'), Text(9, 0, '2015Q2'), Text(10, 0, '2015Q3'), Text(11, 0, '2015Q4'), Text(12, 0, '2016Q1'), Text(13, 0, '2016Q2'), Text(14, 0, '2016Q3'), Text(15, 0, '2016Q4'), Text(16, 0, '2017Q1'), Text(17, 0, '2017Q2'), Text(18, 0, '2017Q3'), Text(19, 0, '2017Q4')]
This combined visualization of the adjusted GDP and the typical sectors it affects shows a general positive correlation between the sectors and GDP. The trend should be as GDP is down, energy, industrials, and materials should also be down (in other words this is a linear relationship). A notable exception is the Energy sector. The reason the Energy sector is an exception to this trend is because during mid 2014 and early 2016 where the entire oil industry (which holds a large stake in the energy sector) plummeted. This skewed the data for the Energy sector, but its important to note still since it shows a good example of other factors creating shifts in the market. Article about the oil prices dropping linked here.
#Checking industries which would be effected by GDP with the nominal GDP.
nat_unemp_ax = df_normalized[["Energy", "Materials", "Industrials", "Nominal GDP"]].plot(figsize=(10,6))
nat_unemp_ax.set_xticks(range(len(df_normalized)))
nat_unemp_ax.set_xticklabels(df_normalized.index.tolist(), rotation=90)
[Text(0, 0, '0'), Text(1, 0, '1'), Text(2, 0, '2'), Text(3, 0, '3'), Text(4, 0, '4'), Text(5, 0, '5'), Text(6, 0, '6'), Text(7, 0, '7'), Text(8, 0, '8'), Text(9, 0, '9'), Text(10, 0, '10'), Text(11, 0, '11'), Text(12, 0, '12'), Text(13, 0, '13'), Text(14, 0, '14'), Text(15, 0, '15'), Text(16, 0, '16'), Text(17, 0, '17'), Text(18, 0, '18'), Text(19, 0, '19')]
As an added bonus for visualization, we decided to show each sector’s normalized individual company performance.
norm_cleaned_stocks_df = cleaned_stocks_df
quarterly_avg_cols = [col for col in cleaned_stocks_df.columns if 'Avg' in col]
scaler = MinMaxScaler()
norm_cleaned_stocks_df[quarterly_avg_cols] = scaler.fit_transform(cleaned_stocks_df[quarterly_avg_cols])
for sector in norm_cleaned_stocks_df['GICS Sector'].unique():
sector_data = norm_cleaned_stocks_df[norm_cleaned_stocks_df['GICS Sector'] == sector]
fig, ax = plt.subplots(figsize=(15, 8))
# Plot each company's data points within the sector
for company in sector_data.index:
company_data = sector_data.loc[company, quarterly_avg_cols]
ax.plot(quarterly_avg_cols, company_data, label=sector_data.loc[company, 'Symbol'])
ax.set_xticks(range(len(quarterly_avg_cols)))
ax.set_xticklabels(quarterly_avg_cols, rotation=90)
ax.set_xlabel("Quarter")
ax.set_ylabel("Normalized Value")
ax.set_title(f"Performance of Companies in {sector} Sector Over Time")
plt.show()
Linear Regression¶
This visualization serves to display the general trend of each sector as a whole over the course of this tutorial’s scope. A more in-depth linear regression of the data is displayed later in this tutorial, but this representation is helpful just to show basic sector performance in relation to our economic factors.
#linear regression of the cleaned S&P500 by industry (GICS Cateogry)
sectors_reset_index = sectors_gdp_unemp_df.reset_index()
fig, ax = plt.subplots(figsize=(20, 12))
# Plot each sector
for sector in sectors:
ax.scatter(sectors_reset_index["Quarter"], sectors_reset_index[sector], label=sector)
# Fit linear regression
X = np.arange(len(sectors_reset_index)).reshape(-1, 1)
y = sectors_reset_index[sector].values.reshape(-1, 1)
model = LinearRegression()
model.fit(X, y)
y_pred = model.predict(X)
ax.plot(sectors_reset_index["Quarter"], y_pred, label=f'{sector} Trend', linestyle='--')
ax.set_xticks(range(len(sectors_reset_index)))
ax.set_xticklabels(sectors_reset_index["Quarter"], rotation=90)
ax.set_xlabel("Quarter")
ax.set_ylabel("Value")
ax.set_title("Sector Performance Over Time with Linear Regression")
ax.legend()
plt.show()
Explaining the Relevance of a Hypothesis Test¶
Our hypothesis is that there is a linear relationship between our economic factors and the individual sectors in the S&P 500. Using the below formulas, if the calculated p-value is below 0.05, then it is quite likely that there is a linear relationship between the economic factors and the sectors of the S&P 500
$$ \text{variance} = \text{MSE} \times \left( \text{diag}\left( \left( X^T X \right)^{-1} \right) \right) $$ $$ \text{standard\_errors} = \sqrt{\text{variance}} $$ $$ \text{tvalues} = \frac{\text{model.coef}}{\text{standard\_errors}} $$ $$ \text{p\_values} = [2 \times (1 - \text{stats.t.cdf}(|t|, n-k-1)) \text{ for } t \text{ in } \text{t\_values}] $$
Correlation Analysis¶
Our next piece of data analysis includes a correlation matrix detailing the strength of correlation between sectors’ performances, gdp, and unemployment. The matrix also indicates a positive or negative correlation (-1 being the strongest negative correlation and 1 being the strongest positive correlation). For example, we can see a strong negative correlation between unemployment rates and the performance of most sectors. We can also see a strong positive correlation between the GDP and sector performance. Energy showed a poor correlation because of what was discussed about skewed data, above.
Seaborn visualization library found here
To learn more about correlation analysis and regression click here
#Lecture 9 (correlation between 2 varialbes)
df_reset = sectors_gdp_unemp_df.reset_index()
numeric_df = df_reset.drop(columns=['Quarter'], errors='ignore')
correlation_matrix = numeric_df.corr()
print(correlation_matrix)
plt.figure(figsize=(14, 10))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', linewidths=.5)
plt.title('Correlation Matrix of Economic Indicators and Stock Performance')
plt.show()
Communication Services \
Communication Services 1.000000
Consumer Discretionary 0.987924
Consumer Staples 0.935060
Energy -0.101493
Financials 0.978598
Health Care 0.959232
Industrials 0.971948
Information Technology 0.975336
Materials 0.967361
Real Estate 0.957317
Utilities 0.967086
Real GDP 0.981291
Nominal GDP 0.987339
Unemployment Underlying Long Term -0.874037
Unemployment Natural -0.805601
Consumer Discretionary Consumer Staples \
Communication Services 0.987924 0.935060
Consumer Discretionary 1.000000 0.949784
Consumer Staples 0.949784 1.000000
Energy -0.115463 -0.195799
Financials 0.975527 0.894617
Health Care 0.986517 0.942485
Industrials 0.969548 0.901185
Information Technology 0.982833 0.903430
Materials 0.969203 0.898534
Real Estate 0.972393 0.981577
Utilities 0.965671 0.953500
Real GDP 0.983483 0.980323
Nominal GDP 0.986387 0.970951
Unemployment Underlying Long Term -0.910904 -0.966750
Unemployment Natural -0.848124 -0.897720
Energy Financials Health Care \
Communication Services -0.101493 0.978598 0.959232
Consumer Discretionary -0.115463 0.975527 0.986517
Consumer Staples -0.195799 0.894617 0.942485
Energy 1.000000 0.012237 -0.057815
Financials 0.012237 1.000000 0.967304
Health Care -0.057815 0.967304 1.000000
Industrials 0.065185 0.994526 0.964797
Information Technology -0.008027 0.994420 0.980619
Materials 0.056745 0.989092 0.965655
Real Estate -0.213068 0.927934 0.962079
Utilities -0.113515 0.952533 0.951182
Real GDP -0.155328 0.952111 0.966176
Nominal GDP -0.114904 0.964317 0.968687
Unemployment Underlying Long Term 0.194390 -0.833426 -0.919999
Unemployment Natural -0.054436 -0.785732 -0.877693
Industrials Information Technology \
Communication Services 0.971948 0.975336
Consumer Discretionary 0.969548 0.982833
Consumer Staples 0.901185 0.903430
Energy 0.065185 -0.008027
Financials 0.994526 0.994420
Health Care 0.964797 0.980619
Industrials 1.000000 0.990682
Information Technology 0.990682 1.000000
Materials 0.994519 0.989337
Real Estate 0.929761 0.941982
Utilities 0.961877 0.955645
Real GDP 0.952584 0.957551
Nominal GDP 0.965429 0.967403
Unemployment Underlying Long Term -0.836263 -0.855386
Unemployment Natural -0.804078 -0.803824
Materials Real Estate Utilities \
Communication Services 0.967361 0.957317 0.967086
Consumer Discretionary 0.969203 0.972393 0.965671
Consumer Staples 0.898534 0.981577 0.953500
Energy 0.056745 -0.213068 -0.113515
Financials 0.989092 0.927934 0.952533
Health Care 0.965655 0.962079 0.951182
Industrials 0.994519 0.929761 0.961877
Information Technology 0.989337 0.941982 0.955645
Materials 1.000000 0.935021 0.962263
Real Estate 0.935021 1.000000 0.979769
Utilities 0.962263 0.979769 1.000000
Real GDP 0.946958 0.985343 0.974638
Nominal GDP 0.959721 0.979992 0.977064
Unemployment Underlying Long Term -0.828671 -0.931752 -0.869724
Unemployment Natural -0.784707 -0.840642 -0.800247
Real GDP Nominal GDP \
Communication Services 0.981291 0.987339
Consumer Discretionary 0.983483 0.986387
Consumer Staples 0.980323 0.970951
Energy -0.155328 -0.114904
Financials 0.952111 0.964317
Health Care 0.966176 0.968687
Industrials 0.952584 0.965429
Information Technology 0.957551 0.967403
Materials 0.946958 0.959721
Real Estate 0.985343 0.979992
Utilities 0.974638 0.977064
Real GDP 1.000000 0.998468
Nominal GDP 0.998468 1.000000
Unemployment Underlying Long Term -0.940251 -0.926316
Unemployment Natural -0.866446 -0.859056
Unemployment Underlying Long Term \
Communication Services -0.874037
Consumer Discretionary -0.910904
Consumer Staples -0.966750
Energy 0.194390
Financials -0.833426
Health Care -0.919999
Industrials -0.836263
Information Technology -0.855386
Materials -0.828671
Real Estate -0.931752
Utilities -0.869724
Real GDP -0.940251
Nominal GDP -0.926316
Unemployment Underlying Long Term 1.000000
Unemployment Natural 0.951430
Unemployment Natural
Communication Services -0.805601
Consumer Discretionary -0.848124
Consumer Staples -0.897720
Energy -0.054436
Financials -0.785732
Health Care -0.877693
Industrials -0.804078
Information Technology -0.803824
Materials -0.784707
Real Estate -0.840642
Utilities -0.800247
Real GDP -0.866446
Nominal GDP -0.859056
Unemployment Underlying Long Term 0.951430
Unemployment Natural 1.000000
Analysis, Hypothesis Testing, and Machine Learning¶
Our Model Choice Analysis¶
When deciding on which model to use we took into consideration KNN, SVM, SGD, CART, Random Forest, and a Linear/Multivariate Regression. Our analysis for choosing is below.
K-Nearest Neighbors (KNN)
Strengths:
Simplicity: Easy to understand Data Size: Works well with small to medium datasets Interpretability: Easy to interpret and visualizeWeaknesses:
Not Scalable: Not suitable for large datasets because it requires significant memory and computation power Sensitivity: Performance can be decreased by irrelevant data features Distance Metric: Very dependent on the choice of distance metric, meaning the wrong metric can ruin the whole modelSupport Vector Machines (SVM)
Strengths:
Effective with Many Features: Works well when the number of dimensions/features is greater than the number of samples. Memory Efficient Robustness: Because it is effective when the number of features exceeds the number of samples, it is a very consistent and robust modelWeaknesses:
Training Time: Computationally intensive and time-consuming for large datasets Choice of Kernel: Requires careful tuning of the kernel and regularization parameter Lack of Interpretability: Harder to interpret the model compared to simpler ones like linear regressionStochastic Gradient Descent (SGD)
Strengths:
Efficiency: Suitable for a larger scale and very efficient when dealing with large datasets Online Learning: Capable of updating the model incrementally as new data arrives. Perfect in the long run for stock and economic data Flexibility: Can be used with a variety of loss functionsWeaknesses:
Convergence Issues: Can converge to less than optimal solutions or diverge if the learning rate is not properly tuned Needs to be Normalized: The performance is sensitive to the scaling of the input features.Classification and Regression Trees (CART)
Strengths:
Interpretability: Easy to interpret and visualize. Non-linearity: Can capture nonlinear relationships between features No Need for NormalizationWeaknesses:
Overfitting: Prone to overfitting Instability: Small changes in the data can result in a completely different tree. Bias-Variance Tradeoff: Requires careful tuning of tree depth to balance bias and variance.
Random Forest
Strengths:
Robustness: Reduces overfitting by averaging multiple decision trees Handles Missing Data: Can handle missing values effectively if not cleaned (irrelevant to this, though) Versatility: Works well with both classification and regressionWeaknesses:
Complexity: Harder to interpret compared to a single decision tree Computationally Intensive: Requires significant computational resources for large datasetsLinear/Multivariate Regression
Strengths:
Simplicity: Easiest by far interpret and visualize Efficiency: Computationally efficient for small to medium-sized datasets. Assumptions: Works well when the relationship between features and target is approximately linearWeaknesses:
Linearity: Assumes a linear relationship between the features and target, which is not always the case Outliers: Sensitive to outliers, which can skew the results
Further info about many of these models can be found here.
Comparing all of these strengths and weaknesses led us to conclude that the best models for our data set, taking into account its size and structure(number of features and data points), would be a Linear Regression, Random Forest Model, and Stochastic Gradient Descent. Each model covers a different assumption of the data and we believe that this group of 3 covers as many pertinent perspectives with as little redundancy as possible.
Linear Regression¶
We chose to continue with a more in depth trained and tested linear regression model because based on the initial linear regression and the correlation analysis, it was clear that the sectors are linearly related to each other to some degree (i.e. a vast majority of the numbers in the correlation heatmap are near 1 or -1).
#make this into a function so it can be used for everything else
def LinReg(size, sector):
X = sectors_gdp_unemp_df.drop(columns=[sector])
y = sectors_gdp_unemp_df[sector]
# Split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=size, random_state=42)
# Normalize
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
# Train
lin_reg = LinearRegression()
lin_reg.fit(X_train_scaled, y_train)
# Predictions and evaluation
y_pred = lin_reg.predict(X_test_scaled)
mse = mean_squared_error(y_test, y_pred)
print(f'Mean Squared Error: {mse}')
# Plot prediction vs actual
plt.figure(figsize=(10, 6))
plt.scatter(y_test, y_pred, edgecolor='k', alpha=0.7)
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'k--', lw=2)
plt.xlabel('Actual Values')
plt.ylabel('Predicted Values')
plt.title('Predicted vs. Actual Values for ' + sector)
plt.grid(True)
plt.show()
LinReg(0.2, "Consumer Discretionary")
Mean Squared Error: 30.024296579397454
LinReg(0.4, "Real Estate")
Mean Squared Error: 3.7438870909191553
LinReg(0.1, "Health Care")
Mean Squared Error: 4.564097219785389
LinReg(0.2, "Industrials")
Mean Squared Error: 6.675113759988081
LinReg(0.1, "Information Technology")
Mean Squared Error: 0.6960937172195818
LinReg(0.2, "Energy")
Mean Squared Error: 9.589620069450447
LinReg(0.2, "Materials")
Mean Squared Error: 1.9131354533793314
Random Forest¶
We chose to also use a Random Forest Model for each sector because of its robust nature and ability to handle large amounts of features without overfitting. After running the random forest on each sector individually, we found that our model is between 87.5% and 100% accurate, depending on the sector. It is likely, however, that the 100% accurate models may have been mildly skewed by the lower amount of data in that sector.
def RandForest(size, sector):
X = sectors_gdp_unemp_df.drop(columns=[sector])
y = sectors_gdp_unemp_df[sector]
median_value = y.median()
y_binary = (y > median_value).astype(int)
# Split
X_train, X_test, y_train, y_test = train_test_split(X, y_binary, test_size=size, random_state=42)
# Normalize
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
rf_clf = RandomForestClassifier(random_state=42)
rf_clf.fit(X_train_scaled, y_train)
# Predict and evaluate
y_pred = rf_clf.predict(X_test_scaled)
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)
print(f'Accuracy: {accuracy}')
print(f'Precision: {precision}')
print(f'Recall: {recall}')
print(f'F1-Score: {f1}')
print(f'Confusion Matrix:\n{conf_matrix}')
print(f'Classification Report:\n{class_report}')
# Confusion matrix
plt.figure(figsize=(8, 6))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues')
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Confusion Matrix for ' + sector)
plt.show()
RandForest(0.4, "Consumer Discretionary")
Accuracy: 1.0
Precision: 1.0
Recall: 1.0
F1-Score: 1.0
Confusion Matrix:
[[5 0]
[0 3]]
Classification Report:
precision recall f1-score support
0 1.00 1.00 1.00 5
1 1.00 1.00 1.00 3
accuracy 1.00 8
macro avg 1.00 1.00 1.00 8
weighted avg 1.00 1.00 1.00 8
RandForest(0.75, "Real Estate")
Accuracy: 0.9333333333333333
Precision: 1.0
Recall: 0.875
F1-Score: 0.9333333333333333
Confusion Matrix:
[[7 0]
[1 7]]
Classification Report:
precision recall f1-score support
0 0.88 1.00 0.93 7
1 1.00 0.88 0.93 8
accuracy 0.93 15
macro avg 0.94 0.94 0.93 15
weighted avg 0.94 0.93 0.93 15
RandForest(0.8, "Health Care")
Accuracy: 0.9375
Precision: 0.8888888888888888
Recall: 1.0
F1-Score: 0.9411764705882353
Confusion Matrix:
[[7 1]
[0 8]]
Classification Report:
precision recall f1-score support
0 1.00 0.88 0.93 8
1 0.89 1.00 0.94 8
accuracy 0.94 16
macro avg 0.94 0.94 0.94 16
weighted avg 0.94 0.94 0.94 16
RandForest(0.5, "Industrials")
Accuracy: 0.9
Precision: 1.0
Recall: 0.8333333333333334
F1-Score: 0.9090909090909091
Confusion Matrix:
[[4 0]
[1 5]]
Classification Report:
precision recall f1-score support
0 0.80 1.00 0.89 4
1 1.00 0.83 0.91 6
accuracy 0.90 10
macro avg 0.90 0.92 0.90 10
weighted avg 0.92 0.90 0.90 10
RandForest(0.7, "Information Technology")
Accuracy: 1.0
Precision: 1.0
Recall: 1.0
F1-Score: 1.0
Confusion Matrix:
[[6 0]
[0 8]]
Classification Report:
precision recall f1-score support
0 1.00 1.00 1.00 6
1 1.00 1.00 1.00 8
accuracy 1.00 14
macro avg 1.00 1.00 1.00 14
weighted avg 1.00 1.00 1.00 14
RandForest(0.6, "Materials")
Accuracy: 0.9166666666666666
Precision: 0.8571428571428571
Recall: 1.0
F1-Score: 0.9230769230769231
Confusion Matrix:
[[5 1]
[0 6]]
Classification Report:
precision recall f1-score support
0 1.00 0.83 0.91 6
1 0.86 1.00 0.92 6
accuracy 0.92 12
macro avg 0.93 0.92 0.92 12
weighted avg 0.93 0.92 0.92 12
RandForest(0.4, "Energy")
Accuracy: 0.875
Precision: 1.0
Recall: 0.75
F1-Score: 0.8571428571428571
Confusion Matrix:
[[4 0]
[1 3]]
Classification Report:
precision recall f1-score support
0 0.80 1.00 0.89 4
1 1.00 0.75 0.86 4
accuracy 0.88 8
macro avg 0.90 0.88 0.87 8
weighted avg 0.90 0.88 0.87 8
Stochastic Gradient Descent¶
Lastly, our case for using a Stochastic Gradient Descent was the strongest of the models for multiple reasons. The most notable being: SGD’s efficiency with larger data sets SGD’s ability to be updated as new data arrives, which would future-proof the model to some extent if any new stock market/economic factor data gets added
def SGDReg(size, sector):
X = sectors_gdp_unemp_df.drop(columns=[sector])
y = sectors_gdp_unemp_df[sector]
# Split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=size, random_state=42)
# Normalize
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
# Train
sgd_reg = SGDRegressor(max_iter=100000, tol=1e-3, random_state=42)
sgd_reg.fit(X_train_scaled, y_train)
# Predict and evaluate
y_pred = sgd_reg.predict(X_test_scaled)
mse = mean_squared_error(y_test, y_pred)
print(f'Mean Squared Error: {mse}')
plt.figure(figsize=(10, 6))
plt.scatter(y_test, y_pred, edgecolor='k', alpha=0.7)
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'k--', lw=2)
plt.xlabel('Actual Values')
plt.ylabel('Predicted Values')
plt.title('Predicted vs. Actual Values for ' + sector )
plt.grid(True)
plt.show()
SGDReg(0.3, "Consumer Discretionary")
Mean Squared Error: 7.098322600123848
SGDReg(0.45, "Real Estate")
Mean Squared Error: 9.264085113648346
SGDReg(0.2, "Health Care")
Mean Squared Error: 5.91806112583378
SGDReg(0.3, "Industrials")
Mean Squared Error: 2.96630489113503
SGDReg(0.4, "Information Technology")
Mean Squared Error: 2.110729967124219
#Energy Stochastic Gradient Descent
SGDReg(0.4, "Energy")
Mean Squared Error: 4.961598180501632
SGDReg(0.3, "Materials")
Mean Squared Error: 1.6855093502619864
Interpretation: Insight & Policy Decision¶
While conducting our data analysis we came to the conclusion that there are various factors which influence the growth of certain industries stocks more than others. As both our analysis and predictions show, the sectors – Consumer Discretionary, Real Estate, Health Care, Industrials, and Information Technology – are affected more by unemployment while sectors – Energy, Materials, and Industrials – are affected more by GDP. This also relates to our initial idea that unemployment will have an inverse relationship with the sectors it affects while GDP has a linear or same moving trend as the sectors it affects.
We also can identify that we didn't incorporate 4 sectors into our analysis. This was because those sectors were not very affected by the economic evaluations we were looking at. For example, financials are more affected by interest rates compared to GDP and unemployment. Now, this doesn’t mean there’s no relationship between those sectors and our economic indicators. This just shows that the small relationship can be considered statistically insignificant since there's either a slight or no change for each of those sectors.
What we can get from this complete analysis and prediction is that although we can see the S&P 500 moving in a rather positive and growing direction, economic factors, especially dramatic ones, can have long term effects. We can also conclude that it is important to look at other factors, as our Energy data showed a skew because of unforeseen and non predictable events which happened during the timeline of which our data was collected. Although interest rates would have been a relevant factor here, we didn’t include them in our analysis because we wanted to show how one factor can affect the stock prices, so by removing a larger economic factor we were able to easily show this.